ABSTRACT

We examine the link between quasars and the red galaxy population using a model for the self-regulated 
growth of supermassive black holes in mergers involving gas-rich galaxies. In this picture, mergers drive 
nuclear inflows of gas, fueling starbursts and obscured quasars until feedback energy from black hole growth 
expels the surrounding gas, rendering the quasar briefly visible as a bright optical source. The quasar dies when 
there is no longer a significant supply of gas to power accretion, and the stellar remnant relaxes as a passively 
evolving spheroid satisfying the $M_{\rm BH}$-$\sigma$ relation and lying on the fundamental plane. The same process that
halts black hole growth also terminates star formation in the remnant, accounting for the observed red galaxy 
population in the bimodal color/morphology distribution of galaxies. Using a model for quasar lifetimes and 
evolution motivated by hydrodynamical simulations of galaxy mergers, we de-convolve the observed quasar 
luminosity function at various redshifts to determine the birthrate of black holes of a given final mass. Identi- 
fying quasar activity with the formation of spheroids in the framework of the merger hypothesis, this enables 
us to infer the corresponding birthrate of spheroids with given properties as a function of redshift. With this 
method, we predict, for the red galaxy population, the distribution of galaxy velocity dispersions, the galaxy 
mass function, mass density, and star formation rates, the luminosity function in many observed wavebands 
(e.g., NUV, U, B, V, R, r, I, J, H, K), the total number density and luminosity density of red galaxies, the 
distribution of colors as a function of magnitude and velocity dispersion for several different wavebands, the 
distribution of mass to light ratios as a function of mass, the luminosity-size relations, and the typical ages and 
distribution of ages (formation redshifts) as a function of both mass and luminosity. For each, we predict the 
evolution at redshifts z = 0-6 and, in each case, our results are in good agreement with observational estimates. 
However, we demonstrate that the predictions strongly disagree with observations if idealized, traditional mod- 
els of quasar lifetimes are adopted in which these objects turn on and off at a fixed luminosity or follow simple 
exponential light curves, instead of the more complicated quasar evolution implied by our simulations. 
Subject headings: quasars: general — galaxies: nuclei — galaxies: active — galaxies: evolution — cosmology: 
theory 



1. INTRODUCTION 

Hierarchical theories of galaxy formation and evolution in- 
dicate that large systems are built up over time through the 
merger of smaller progenitors. Galaxy interactions in the lo- 
cal Universe motivate the "merger hypothesis" (Toomre & 
Toomre 1972; Toomre 1977), according to which collisions 
between spiral galaxies produce the massive ellipticals ob- 
served at present times, a view supported by self-consistent 
modeling of mergers (for reviews, see e.g. Barnes & Hern- 
quist 1992; Barnes 1998). Furthermore, it is believed that 
most galaxies harbor supermassive black holes (e.g. Kor- 
mendy & Richstone 1995; Richstone et al. 1998; Kormendy 
& Gebhardt 2001) and that the masses of these black holes 
correlate with either the mass (Magorrian et al. 1998) or 
the velocity dispersion (i.e. the $M_{\rm BH}$-$\sigma$ relation: Ferrarese &
Merritt 2000; Gebhardt et al. 2000) of their host spheroids, 
demonstrating that the growth of supermassive black holes 
and galaxy formation are linked. Simulations of the self- 
regulated growth of black holes in galaxy mergers (Di Mat- 
teo et al. 2005) have shown that the energy released by this 
process can have a global impact on the structure of the rem- 
nant, implying that models of galaxy formation and evolution 
must account for black hole growth in a fully self-consistent 
manner.

Based on surveys such as SDSS, 2dFGRS, COMBO-17,
and DEEP, there is mounting evidence that the color distribution
of galaxies at z = 0 is bimodal (e.g. Strateva et al. 2001;
Blanton et al. 2003; Kauffmann et al. 2003a; Baldry et al.
2004; Balogh et al. 2004), and can be well fitted by two
Gaussians (e.g. Baldry et al. 2004). The mean color and dispersion
of these two (red and blue) distributions depend on
luminosity, but little on galaxy environment (Blanton et al.
2003; Balogh et al. 2004; Hogg et al. 2004). This bimodality
extends to moderate redshifts, z ~ 1.5 (e.g., Bell et al.
2003, 2004b; Willmer et al. 2005; Faber et al. 2005), and
there exists a population of massive, very red galaxies at
even higher redshift (e.g. Franx et al. 2003). The red galaxies
in this bimodal distribution are almost all elliptical,
absorption-line galaxies, at least at redshifts z ≲ 1 (e.g.,
Strateva et al. 2001; Bernardi et al. 2003c; Bell et al. 2004a;
Ball et al. 2005), which appear to be passively evolving from
a redshift of peak star formation z ~ 1.5-2.5, according
to both fundamental plane (e.g., van Dokkum et al. 2001;
Treu et al. 2001, 2002; Gebhardt et al. 2003; Wuyts et al.
2004; van de Ven et al. 2003) and color and spectral analyses
(e.g., Menanteau et al. 2001; Kuntschner et al. 2003;
Treu et al. 2002; van de Ven et al. 2003; Bell et al. 2004b). It
also appears that the properties of the red galaxies and their z = 0
distribution, as well as their clustering and mass density evolution,
are consistent with their being formed through mergers
and thereafter relaxing quiescently (e.g., Kauffmann et al.;
Budavári et al. 2003; Bell et al. 2003; Baldry et al.
2004; Weiner et al. 2005).

For mergers to produce red ellipticals from blue, star- 
forming disks and yield a bimodal color distribution, the color 
must evolve rapidly, or the observed bimodality would be 
washed out, requiring that star formation be terminated soon 
after a merger. Springel et al. (2005a) showed that this will
not occur, especially in gas rich mergers at high redshift, 
if black hole feedback is neglected, because even a small 
amount of cold gas remaining after a powerful starburst will 
fuel a low level of star formation for a Hubble time (e.g. 
Mihos & Hernquist 1994, 1996; Hernquist & Mihos 1995),
preventing the remnant from reddening sufficiently. However, 
Springel et al. (2005a) demonstrated that feedback from black
hole growth and quasar activity caused by mergers can re- 
sult in a much more violent and abrupt expulsion and heating 
of the remaining gas, as the black hole nears its final mass. 
This process also produces a remnant that satisfies observed 
correlations between black hole and host galaxy properties
(Di Matteo et al. 2005).

Observations of elliptical galaxy ages and star formation
histories motivate the notion of "anti-hierarchical"
growth, or "cosmic downsizing" (e.g., Bower et al. 1992;
van Dokkum & Franx 1996; Ellis et al. 1997; Bernardi et al.
1998; Jørgensen et al. 1996; Bell et al. 2004b; Faber et al.
2005), where the most massive spheroids are also the oldest
and reddest systems. While black hole feedback is likely a
key ingredient in shutting down star formation in these systems
at high redshifts, allowing them to redden onto the observed
z = 0 color-magnitude relation, it does not automatically
imply that particular black hole and galaxy formation
scenarios are self-consistent. Moreover, although there is evidence
of downsizing in quasar activity, with the most luminous
quasars active at z ~ 2 and the peak formation redshift of
quasars evolving as a function of luminosity (e.g. Page et al.
1997; Miyaji et al. 2000; Cowie et al. 2003; Ueda et al. 2003;
Hasinger, Miyaji, & Schmidt 2005; La Franca et al. 2005), it
has not been demonstrated that the implied downsizing is consistent
with, or even quantitatively similar to, that of the galaxy
population. As we demonstrate in what follows, the relationship
between downsizing in galaxy and quasar populations depends
sensitively on the model chosen for quasar light curves
and lifetimes in any scenario in which spheroids and quasars
form together.

In our picture, red, remnant spheroids and supermassive 
black holes are produced simultaneously in galaxy mergers 
which also yield starbursts and quasar activity. Previously, 
we studied black hole evolution in mergers using simula- 
tions (Hopkins et al. 2005a-e), and showed that the com- 
plex, luminosity-dependent quasar lifetimes and obscuration 
(Hopkins et al. 2005a,b) lead to a new interpretation of the
quasar luminosity function (Hopkins et al. 2005c), where the 
faint end of the luminosity function consists mainly of quasars 
growing to much larger final masses or in declining states fol- 
lowing peak quasar activity. This implies that the distribution 
of quasars being created at a given redshift as a function of the 
quasar peak luminosity or final black hole mass is peaked at a 
luminosity (mass) corresponding to the observed break in the 
luminosity function, falling off towards brighter and fainter 
luminosities. This differs from all previous models of quasar 
lifetimes, which predict that this distribution should have es- 
sentially identical shape to the observed luminosity function, 
increasing monotonically with decreasing luminosity (black 
hole mass). Because our simulations also yield the observed correlations
between black hole and remnant host galaxy properties,
we can deduce the distribution and evolution of the remnant
red galaxies produced in these merger events. These predictions
will necessarily differ from those based on idealized
models of quasar lifetimes, which yield a qualitatively differ- 
ent distribution of black hole masses (and thus host galaxy 
masses and velocity distributions) being formed at any given 
redshift. 

Here, we use our models of quasar lifetimes and lightcurves 
and the observed quasar luminosity function to determine the 
rate at which quasars with a given peak luminosity or final 
black hole mass are born in mergers. Using the scaling rela- 
tions between black hole and host galaxy properties derived 
from these simulations, we determine the birthrate of rem- 
nants with given properties as a function of redshift, and use 
this to predict the properties and evolution of the red, elliptical
population in various wavebands. In § 2 we describe our
methodology, including the simulations (§ 2.1), our model of
quasar lifetimes and the quasar luminosity function (§ 2.2),
and the black hole-host galaxy scaling relations obtained from
the simulations (§ 2.3). In § 3 we use this information to predict
the distribution of galaxy velocity dispersions with redshift,
as well as the galaxy mass function and its evolution. In
§ 4 we obtain the galaxy luminosity function and its evolution
in many observed wavebands and for redshifts z = 0-6.
In § 5 we predict the distribution of galaxy colors as a function
of magnitude in several bands, velocity dispersion, and
redshift. In § 6 we estimate the distribution of mass-to-light
ratios and luminosity-size relation, and their differential evolution
with time, as a function of mass and redshift. In § 7
we predict the distribution of formation ages (redshifts) as a
function of galaxy mass, velocity dispersion, and luminosity.
Finally, in § 8 we discuss our results and their implications for
observations and models of the joint formation of spheroids 
and active galactic nuclei (AGN). 

Throughout, we adopt an $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$,
$H_0 = 70\,{\rm km\,s^{-1}\,Mpc^{-1}}$ cosmology. Unless otherwise stated, all
magnitudes are in the Vega system.

2. METHODOLOGY 

2.1. The Simulations 

The simulations were performed using GADGET-2 
(Springel 2005), a new version of the parallel TreeSPH code
GADGET (Springel, Yoshida, & White 2001). GADGET-2 employs
a fully conservative formulation (Springel & Hernquist
2002) of smoothed particle hydrodynamics (SPH), which
maintains simultaneous energy and entropy conservation 
even when smoothing lengths evolve (see e.g., Hernquist 
1993b, O'Shea et al. 2005). Our simulations account 
for radiative cooling, heating by a UV background (as in 
Katz et al. 1996, Dave et al. 1999), and incorporate a 
sub-resolution model of a multiphase interstellar medium 
(ISM) to describe star formation and supernova feedback
(Springel & Hernquist 2003a). Feedback from supernovae
is captured in this sub-resolution model through an effective 
equation of state for star-forming gas, enabling us to evolve 
disks with large gas fractions so that they are stable against 
fragmentation (see, e.g. Springel et al. 2005b; Springel & 
Hernquist 2005; Robertson et al. 2004, 2005a). 

Supermassive black holes (BHs) are represented by "sink"
particles that accrete gas at a rate $\dot{M}$ estimated from the local
gas density and sound speed using an Eddington-limited
prescription based on Bondi-Hoyle-Lyttleton accretion theory
(Bondi 1952; Bondi & Hoyle 1944; Hoyle & Lyttleton 1939).
The bolometric luminosity of the black hole is $L_{\rm bol} = \epsilon_r \dot{M} c^2$,
where $\epsilon_r = 0.1$ is the radiative efficiency. We assume that a
small fraction (typically ≈ 5%) of $L_{\rm bol}$ couples dynamically to
the surrounding gas, and that this feedback is injected into
the gas as thermal energy. This fraction is a free parameter,
which we determine as in Di Matteo et al. (2005) by matching
the observed normalization of the $M_{\rm BH}$-$\sigma$ relation. For now,
we do not resolve the small-scale dynamics of the gas directly
around the black hole, but assume that the time-averaged accretion
rate can be estimated on the scale of our spatial resolution
(reaching ≈ 20 pc, in the best cases).
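
The accretion and feedback prescription described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the GADGET-2 implementation: the function names, the normalization `alpha`, the relative-velocity term `v_rel`, and the example gas properties are all hypothetical, while the radiative efficiency (0.1) and the coupling fraction (about 5%) are the values quoted in the text.

```python
import math

# Physical constants (CGS units)
G = 6.674e-8          # gravitational constant [cm^3 g^-1 s^-2]
C = 2.998e10          # speed of light [cm s^-1]
M_P = 1.6726e-24      # proton mass [g]
SIGMA_T = 6.6524e-25  # Thomson cross-section [cm^2]
M_SUN = 1.989e33      # solar mass [g]


def bondi_rate(m_bh, rho, c_s, v_rel=0.0, alpha=1.0):
    """Bondi-Hoyle-Lyttleton accretion rate [g/s].

    alpha is a hypothetical dimensionless normalization (not quoted in the text).
    """
    return 4.0 * math.pi * alpha * G**2 * m_bh**2 * rho / (c_s**2 + v_rel**2) ** 1.5


def eddington_rate(m_bh, eps_r=0.1):
    """Accretion rate [g/s] at which eps_r * Mdot * c^2 equals the Eddington luminosity."""
    return 4.0 * math.pi * G * m_bh * M_P / (eps_r * SIGMA_T * C)


def accretion_step(m_bh, rho, c_s, v_rel=0.0, eps_r=0.1, eps_f=0.05, alpha=1.0):
    """Return (Mdot [g/s], L_bol [erg/s], thermal feedback power [erg/s])."""
    mdot = min(bondi_rate(m_bh, rho, c_s, v_rel, alpha), eddington_rate(m_bh, eps_r))
    l_bol = eps_r * mdot * C**2   # L_bol = eps_r * Mdot * c^2
    return mdot, l_bol, eps_f * l_bol


# Example: a 1e8 Msun black hole in dense gas (illustrative values only)
mdot, l_bol, e_dot = accretion_step(1e8 * M_SUN, rho=1e-22, c_s=1e6)
```

In this example the Bondi estimate exceeds the Eddington rate, so the accretion is capped and the luminosity equals the Eddington luminosity of the black hole; at much lower gas density the Bondi rate is the one returned.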

The progenitor galaxies are constructed as described in 
Springel et al. (2005b). For each simulation, we generate two
stable, isolated spiral galaxies, with dark matter halos having
a Hernquist (1990) profile, motivated by cosmological simulations
(e.g. Navarro et al. 1996; Busha et al. 2004), simple
analytical arguments (e.g. Jaffe 1987; White 1987; see Barnes 
1998, §7.3), and observations (e.g. Rines et al. 2002, 2002, 
2003, 2004), an exponential disk of gas and stars, and (optionally)
a bulge. The galaxies have masses $M_{\rm vir} = V_{\rm vir}^3/(10 G H_0)$
for z = 0, with the baryonic disk having a mass fraction
$m_d = 0.041$; the bulge (when present) has $m_b = 0.0136$, and
the rest of the mass is in dark matter, typically with a concentration
parameter of 9.0. The disk scale-length is computed
based on an assumed spin parameter $\lambda = 0.033$, chosen to be
near the mode in the observed $\lambda$ distribution (Vitvitska et al.
2002), and the scale-length of the bulge is set to 0.2 times the
resulting value.
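
As a rough worked example of the mass normalization above, the following sketch evaluates $M_{\rm vir} = V_{\rm vir}^3/(10 G H_0)$ at z = 0 and splits it into components using the quoted mass fractions. The function names and the choice of 160 km/s as the example virial velocity are illustrative assumptions.

```python
G_KPC = 4.30091e-6   # gravitational constant [kpc (km/s)^2 / Msun]
H0 = 70.0 / 1000.0   # Hubble constant [km/s/kpc], i.e. 70 km/s/Mpc


def virial_mass(v_vir_kms, h0=H0):
    """Virial mass [Msun] from the virial velocity [km/s] at z = 0."""
    return v_vir_kms**3 / (10.0 * G_KPC * h0)


def component_masses(v_vir_kms, m_d=0.041, m_b=0.0136, with_bulge=True):
    """Split the virial mass into disk, bulge, and dark matter masses [Msun]."""
    m_vir = virial_mass(v_vir_kms)
    disk = m_d * m_vir
    bulge = m_b * m_vir if with_bulge else 0.0
    return {"disk": disk, "bulge": bulge, "dark": m_vir - disk - bulge}


m_vir_160 = virial_mass(160.0)        # roughly 1.4e12 Msun
parts_160 = component_masses(160.0)
```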

Typically, each galaxy is initially composed of 168000 dark 
matter halo particles, 8000 bulge particles (when present), 
24000 gas and 24000 stellar disk particles, and one BH parti- 
cle. We vary the numerical resolution, with many of our sim- 
ulations using instead twice as many particles in each galaxy, 
and a subset of simulations with up to 128 times as many par- 
ticles. We vary the initial seed mass of the black hole to iden- 
tify any systematic dependence of our results on this choice. 
In most cases, we choose the seed mass either in accord with 
the observed $M_{\rm BH}$-$\sigma$ relation or to be sufficiently small that its
presence will not have an immediate effect. Given the particle 
numbers employed, the dark matter, gas, and star particles are 
all of roughly equal mass, and central cusps in the dark matter 
and bulge profiles are reasonably well resolved (see Fig. 2 in
Springel et al. 2005b). 

The form of our fitted quasar lifetimes and galaxy scaling
relations is based on a series of several hundred
merger simulations, described in Robertson et al. (2005b) and
Hopkins et al. (2005e). We vary the resolution, the orbital
geometry, the masses and structural properties of the merging
galaxies, the mass ratio of the galaxies, initial gas fractions, 
halo concentrations, the parameters describing star formation 
and feedback from supernovae and black hole growth, and 
initial black hole masses. The progenitors have virial velocities
$V_{\rm vir}$ = 80, 113, 160, 226, 320, and 500 km s$^{-1}$, constructed
to resemble galaxies at redshifts z = 0, 2, 3, and 6, and span a
range in final black hole mass $M_{\rm BH} \sim 10^{5}-10^{10}\,M_\odot$. This
large set of runs allows us to investigate merger evolution 
for a wide range of galaxy properties and to identify any 
systematic dependence of our modeling. Moreover, the ex- 
tensive range of conditions probed gives us a large dynamic 
range in our simulations, with final spheroid masses spanning 
$M_{\rm sph} \sim 10^{8}-10^{13}\,M_\odot$, covering nearly the entire observed
range. 



2.2. Quasar Lifetimes and the Quasar Luminosity Function 

Previous theoretical studies of the quasar luminosity 
function have generally employed idealized quasar light 
curves, either some variant of a "feast or famine" or 
"light bulb" model (in which quasars have only two states: 
"on" or "off", with constant luminosity in the "on" state; 
e.g., Small & Blandford 1992; Kauffmann & Haehnelt 2000;
Haiman & Menou 2000; Haiman, Quataert, & Bower 2004)
or a pure exponential light curve (constant Eddington-ratio
growth or exponential decay; e.g., Haiman & Loeb 1998;
Volonteri et al. 2003; Wyithe & Loeb 2003). However, our
simulations of galaxy mergers suggest that these models are 
a poor approximation to the quasar lifetime at any given lu- 
minosity. The light curves from the simulations are complex, 
generally having periods of rapid accretion after "first pas- 
sage" of the galaxies, followed by an extended quiescent pe- 
riod, then a transition to a peak, highly luminous quasar phase, 
and then a dimming as self-regulated mechanisms expel gas 
from the remnant center after the black hole reaches a critical 
mass. In addition, the accretion rate at any time can be vari- 
able over small timescales ~ Myr, but despite these complex- 
ities, the statistical nature of the light curve can be described 
by simple forms, which we describe below. 

From the simulations, we find that the differential quasar 
lifetime, i.e. the time spent by a quasar in a merger in a given 
logarithmic luminosity interval, is well fitted by an exponen- 
tial, 

$$\frac{dt}{d\log L} = t^{*}_{Q}\,\exp(-L/L^{*}_{Q}), \qquad (1)$$

where $L^{*}_{Q}$ is proportional to the peak quasar luminosity ($L_{\rm peak}$;
roughly, the Eddington luminosity of the final black hole
mass), and $t^{*}_{Q}$ is weakly dependent on peak luminosity. When
quantified as a function of Lpeak in this manner, the quasar 
lifetime shows no systematic dependence on any host galaxy 
properties, merger parameters, initial black hole masses, ISM 
and gas equations of state and star formation models, or any 
other varied parameters (Hopkins et al. 2005e). 

If quasars of a given peak luminosity are being created or
activated at a rate $\dot{n}(L_{\rm peak})$ at some redshift z, then, to first
order in the ratio of the quasar lifetime to the Hubble time, the observed
quasar luminosity function (neglecting attenuation) is

$$\frac{d\Phi}{d\log L} = \int \frac{dt(L, L_{\rm peak})}{d\log(L)}\,\dot{n}(L_{\rm peak})\,d\log(L_{\rm peak}). \qquad (2)$$



Knowing the quasar lifetime, we can invert this relation to
determine the birthrate of quasars as a function of peak luminosity
and redshift, $\dot{n}(L_{\rm peak})$. As shown in Hopkins et al.
(2005e), the quasar luminosity functions in optical, UV, soft
X-ray, and hard X-ray wavebands (including the effects of
extinction) and at all measured redshifts are simultaneously
well-fitted by a lognormal $\dot{n}(L_{\rm peak})$,

$$\dot{n}(L_{\rm peak}) = \dot{n}_{*}\,\frac{1}{\sigma_{*}\sqrt{2\pi}}\,\exp\left[-\frac{1}{2}\left(\frac{\log(L_{\rm peak}/L_{*})}{\sigma_{*}}\right)^{2}\right]. \qquad (3)$$

Here, $\dot{n}_{*}$ is the total number of quasars being created or activated
per unit comoving volume per unit time; $L_{*}$ is the median
of the lognormal, the characteristic peak luminosity of
quasars activating (i.e. the peak luminosity at which $\dot{n}(L_{\rm peak})$
itself peaks), which is directly related to the break luminosity
in the observed luminosity function (Hopkins et al. 2005c);
and $\sigma_{*}$ is the width of the lognormal in $\dot{n}(L_{\rm peak})$, which determines
the slope of the bright end of the luminosity function.
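
The forward calculation in Equations (1)-(3) can be sketched numerically as below, convolving the exponential lifetime with the lognormal birthrate. This is a minimal illustration, not the authors' fitting code. The normalization `t_star` and the ratio `lq_frac` between $L^{*}_{Q}$ and $L_{\rm peak}$ are placeholder assumptions (the text states only that the two luminosities are proportional), and the lognormal parameters in the example are the best-fit values quoted later in this section.

```python
import math


def differential_lifetime(lum, l_peak, t_star, lq_frac):
    """Eq. 1: dt/dlogL = t*_Q exp(-L / L*_Q), assuming L*_Q = lq_frac * L_peak."""
    return t_star * math.exp(-lum / (lq_frac * l_peak))


def birthrate(l_peak, n_star, l_star, sigma_star):
    """Eq. 3: lognormal birthrate per unit log10(L_peak)."""
    x = math.log10(l_peak / l_star) / sigma_star
    return n_star / (sigma_star * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * x * x)


def luminosity_function(lum, n_star, l_star, sigma_star, t_star, lq_frac, n_steps=4000):
    """Eq. 2: trapezoidal integral over log10(L_peak) within +/- 6 sigma of L*."""
    lo = math.log10(l_star) - 6.0 * sigma_star
    hi = math.log10(l_star) + 6.0 * sigma_star
    h = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        l_peak = 10.0 ** (lo + i * h)
        weight = 0.5 if i in (0, n_steps) else 1.0   # trapezoid end-point weights
        total += (weight
                  * differential_lifetime(lum, l_peak, t_star, lq_frac)
                  * birthrate(l_peak, n_star, l_star, sigma_star))
    return total * h


# Placeholder lifetime parameters (assumptions); lognormal parameters from the text.
params = dict(n_star=10.0 ** -6.37, l_star=10.0 ** 11.3, sigma_star=0.7,
              t_star=100.0, lq_frac=0.2)
phi_faint = luminosity_function(1.0e5, **params)
phi_break = luminosity_function(10.0 ** 11.3, **params)
phi_bright = luminosity_function(1.0e13, **params)
```

Because the lifetime is flat well below $L^{*}_{Q}$, the faint end of the resulting luminosity function tends to the product of `t_star` and `n_star`, even though the birthrate itself falls off at low peak luminosity.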
Fig. 1. Predicted luminosity function (left, solid line) at z = 1 using
our model for quasar lifetimes and evolution. The corresponding
hard X-ray luminosity function of Ueda et al. (2003) (circles) is shown
for comparison, rescaled to bolometric luminosity following Hopkins et al.
(2005e), Marconi et al. (2004), and Vignali et al. (2003). The $\dot{n}(L_{\rm peak})$
distribution (rescaled in arbitrary units for comparison) is shown (dashed line), as
is the $\dot{n}(L_{\rm peak})$ distribution obtained using a "light-bulb" or exponential light
curve model of the quasar lifetime (dotted). On the right, the corresponding
rate of formation of black holes/quasars of a given final mass, $\dot{n}(M_{\rm BH})$,
is shown (dashed), as well as the rate of formation of remnant spheroids
of a given virial (dotted; $\dot{n}(M_{\rm vir})$) and stellar (solid; $\dot{n}(M_{\rm sph})$) mass,
determined from the $M_{\rm BH}$-$M_{\rm vir}$ and fundamental plane relations of our
simulations (Robertson et al. 2005b).

The evolution of the quasar luminosity function with redshift
is well-described by pure peak-luminosity evolution, where
$\dot{n}_{*}$ and $\sigma_{*}$ are constant but $L_{*} = L^{0}_{*}\exp(k_1\tau)$. Here, $\tau$ is the
fractional lookback time $\tau \equiv H_0 \int dt$. Above z ~ 2-3, the
quasar population declines, but the detailed shape and evolution
of the faint end of the quasar luminosity function at these
redshifts is poorly constrained from observations. Therefore
we consider two choices: either pure peak-luminosity evolution
(PPLE), where we multiply $L_{*}$ by a factor $\exp(-k_2[z - 2])$
for z > 2, or pure density evolution (PDE), where we multiply
the z = 2 luminosity function by a normalization factor (i.e.
multiply $\dot{n}_{*}$ by a factor) with identical functional form.

We follow Hopkins et al. (2005e), but fit to the more recent
luminosity functions in the hard X-ray, soft X-ray, and optical
from Ueda et al. (2003), Hasinger, Miyaji, & Schmidt (2005), and
Richards et al. (2005), respectively, and find the best-fit
parameters $(\log(L^{0}_{*}/L_\odot), k_1, k_2, \log(\dot{n}_{*}/{\rm Mpc^{-3}\,Myr^{-1}}), \sigma_{*}) =
(11.3, 4.0, 0.65, -6.37, 0.7)$. These are similar to the values
given by Hopkins et al. (2005e) using older observations, although
they suggest a somewhat narrower width in peak luminosities
(with the peak more closely related to the break in
the observed luminosity function).
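
To make the redshift evolution concrete, the sketch below evaluates the fractional lookback time for the adopted cosmology and the resulting characteristic peak luminosity in the pure peak-luminosity evolution (PPLE) case, using the best-fit parameters quoted above. It is an illustration only: the function names are hypothetical, and applying the high-redshift factor on top of the continuing lookback-time term (rather than freezing $L_{*}$ at its z = 2 value) is an assumed reading of the text.

```python
import math

OMEGA_M, OMEGA_L = 0.3, 0.7          # adopted cosmology (flat)
LOG_L0, K1, K2 = 11.3, 4.0, 0.65     # best-fit parameters quoted in the text


def fractional_lookback_time(z, n_steps=2000):
    """tau = H0 * t_lookback = integral_0^z dz' / [(1 + z') E(z')] (trapezoid rule)."""
    def integrand(zp):
        return 1.0 / ((1.0 + zp) * math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + OMEGA_L))

    h = z / n_steps
    total = 0.5 * (integrand(0.0) + integrand(z))
    for i in range(1, n_steps):
        total += integrand(i * h)
    return total * h


def log_l_star_pple(z):
    """log10(L*/Lsun): L* = L*^0 exp(k1 tau), times exp(-k2 [z - 2]) for z > 2."""
    ln_factor = K1 * fractional_lookback_time(z)
    if z > 2.0:
        ln_factor -= K2 * (z - 2.0)   # assumed to multiply the tau-dependent term
    return LOG_L0 + ln_factor * math.log10(math.e)


tau_2 = fractional_lookback_time(2.0)
log_l_z0, log_l_z2, log_l_z3 = (log_l_star_pple(z) for z in (0.0, 2.0, 3.0))
```

With these parameters the characteristic peak luminosity rises by more than an order of magnitude from z = 0 to z = 2 and turns over at higher redshift.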

From the form of the quasar light curve and lifetime as a 
function of luminosity, we can calculate the final black hole 
mass for a given $L_{\rm peak}$, and convert from $\dot{n}(L_{\rm peak})$ to $\dot{n}(M_{\rm BH})$,
the birthrate of black holes of a given final (post-merger) 
mass. Accounting for the corrections owing to the non-trivial 
shape of the quasar light curve and lifetime, we obtain 

$$M_{\rm BH} = M_{\rm Edd}(L_{\rm peak})\,\left[1.24\,(L_{\rm peak}/10^{13}\,L_\odot)^{-0.11}\right]. \qquad (4)$$

Applying this conversion to our fitted $\dot{n}(L_{\rm peak})$, we
find that $\dot{n}(M_{\rm BH})$ is also a lognormal, with identical
redshift evolution and functional form, and
$(\log(M^{0}_{\rm BH}/M_\odot), k_1, k_2, \log(\dot{n}_{*,{\rm BH}}/{\rm Mpc^{-3}\,Myr^{-1}}), \sigma_{*,{\rm BH}}) =
(6.45, 3.2, 0.59, -6.25, 0.62)$.



Figure 1 shows an example of the results of our procedure
for deconvolving an observed quasar luminosity function to
obtain the black hole birthrate, using the Ueda et al. (2003)
hard X-ray luminosity function at z ~ 1. The left panel gives
the quasar luminosity function, where the black points are the
observations and the line is the prediction from the quasar
lifetimes and fitted $\dot{n}(M_{\rm BH})$ above. The right panel shows
the corresponding $\dot{n}(M_{\rm BH})$ distribution at this redshift. The
$\dot{n}(L_{\rm peak})$ [$\dot{n}(M_{\rm BH})$] distribution derived has the property that it
peaks at a characteristic peak luminosity (black hole mass)
corresponding to the break in the observed luminosity function,
and falls off to both lower and higher luminosities. In
this interpretation of the quasar luminosity function, "cosmic
downsizing" follows naturally as the break in the quasar luminosity
function moves to lower luminosities at lower redshifts,
and the implied downsizing is indeed quantitatively more dramatic
than that implied by idealized models of quasar activity
(see § 6 and Figure 23 of Hopkins et al. 2005e).

It is important to note that the slope of the faint (low-$M_{\rm BH}$)
end of $\dot{n}(M_{\rm BH})$ is only weakly constrained by the observed
luminosity function, a point discussed further in § 4.1. To
illustrate this, the figure shows the birthrate of quasars of a
given peak luminosity, $\dot{n}(L_{\rm peak})$ (plotted in arbitrary units to
demonstrate this qualitative behavior) as the dashed line. The
$\dot{n}(L_{\rm peak})$ distribution which would be obtained using a "light-bulb"
or exponential light curve model of the quasar lifetime
is also shown (dotted line) for comparison (the $\dot{n}(M_{\rm BH})$,
$\dot{n}(M_{\rm vir})$, and $\dot{n}(M_{\rm sph})$ distributions for such a model will have the
same shape as the $\dot{n}(L_{\rm peak})$ distribution, as explained below in
§ 2.3). The two models make very different predictions for
luminosities/masses below those corresponding to the break
in the observed luminosity function.

Although we do not consider the brief active quasar and 
starburst phases in our subsequent analysis (as they are heav- 
ily affected by rapidly evolving star formation, dust obscura- 
tion, merger dynamics, and quasar luminosities), we note that 
our modeling allows us to predict the behavior of the active 
quasar host galaxy luminosity function. We expect that the 
active quasars at a given redshift should have a narrow range 
in peak luminosities (black hole masses), corresponding to a 
narrow range in host galaxy stellar masses. This is shown in 
the $\dot{n}(L_{\rm peak})$ and $\dot{n}(M_{\rm sph})$ (derived below) distributions given in
Figure 1. We therefore expect that the host galaxies of quasars
active at a given time will have a much narrower range in luminosities
than that predicted by e.g. idealized models of the
quasar lifetime (for which $\dot{n}(L_{\rm peak})$ and therefore $\dot{n}(M_{\rm sph})$ must
increase monotonically with decreasing luminosity/mass; see 
e.g. Lidz et al. 2005). There is observational support for this: 
the quasar host galaxy luminosity function is found to follow 
an approximately lognormal distribution with narrow width 
$\sigma_{\log(L,{\rm host})} \sim \sigma_{\log(M,{\rm host})} \approx 0.2$ (0.6-0.7 magnitudes) and a
peak roughly corresponding to the stellar mass of quasar hosts
with $L_{\rm peak} \sim$ the quasar luminosity function break luminosity
(Bahcall et al. 1997; McLure et al. 1999; Hamilton et al.
2002).

2.3. Scaling Relations Among Galaxy and Black Hole 
Properties 

Self-regulated black hole growth in our simulations yields
a black hole mass-bulge velocity dispersion ($M_{\rm BH}$-$\sigma$) relation
(Di Matteo et al. 2005) which agrees well with the observations
of, e.g., Gebhardt et al. (2000), Ferrarese & Merritt
(2000), and Tremaine et al. (2002). Robertson et al. (2005b) further
study this relation, and find that it holds for mergers occurring
at any redshift, with a constant slope and weak evolution in the
normalization. From the simulations, they find

$$\log(M_{\rm BH}/M_\odot) \approx 8.1 + 4.0\,\log\left(\frac{\sigma}{200\,{\rm km\,s^{-1}}}\right) - 0.19\,\log(1+z). \qquad (5)$$

The precise values depend on the fitting method, but in all
cases agree well with those determined in Tremaine et al.
(2002) for z = 0. The scatter about this relation from the simulations
is ~ 0.3 dex, similar to that observed. It is also important
to note that the evolution seen in these simulations produces
a z = 0 scatter consistent with what is observed, which
is not the case for all theoretical models (Robertson et al.
2005b).
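
The fitted relation in Equation (5) is simple enough to evaluate directly. The following sketch implements it and its inverse; the function names are illustrative, and the coefficients are those quoted in the text.

```python
import math


def log_mbh_from_sigma(sigma_kms, z=0.0):
    """Eq. 5: log10(M_BH / Msun) from velocity dispersion [km/s] and redshift."""
    return 8.1 + 4.0 * math.log10(sigma_kms / 200.0) - 0.19 * math.log10(1.0 + z)


def sigma_from_log_mbh(log_mbh, z=0.0):
    """Inverse of Eq. 5: velocity dispersion [km/s] for a given log10(M_BH / Msun)."""
    return 200.0 * 10.0 ** ((log_mbh - 8.1 + 0.19 * math.log10(1.0 + z)) / 4.0)


log_m_local = log_mbh_from_sigma(200.0)         # 8.1 at z = 0
log_m_highz = log_mbh_from_sigma(200.0, z=2.0)  # slightly lower at fixed sigma
sigma_back = sigma_from_log_mbh(log_m_highz, z=2.0)
```

At fixed velocity dispersion the black hole mass at z = 2 is lower by about 0.09 dex, which illustrates how weak the fitted evolution is compared with the quoted scatter of about 0.3 dex.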

The weak evolution in the $M_{\rm BH}$-$\sigma$ relation is caused by
an increasing $\sigma$ for a given stellar mass with increasing redshift,
as halos at higher redshift are more compact; the relation
between black hole mass and total stellar mass ($M_{\rm BH}$-$M_{\rm sph}$)
is independent of redshift. This independence is also suggested
observationally by galaxy-AGN clustering properties
as a function of redshift (Adelberger & Steidel 2005). From
our simulations, we can similarly determine the $M_{\rm BH}$-$M_{\rm vir}$
and $M_{\rm BH}$-$M_{\rm sph}$ relationships, giving

$$M_{\rm BH} = 7.0 \times 10^{-4}\,M_{\rm vir}, \qquad (6)$$

$$M_{\rm BH} = 0.001\,M_{\rm sph}, \qquad (7)$$

in reasonable agreement with the observations of
Marconi & Hunt (2003), if we account for the slightly
different definitions of $M_{\rm vir}$ used. Here, $M_{\rm vir}$ is the virial
mass within an effective radius, alternatively defined by
$M_{\rm vir} = k\sigma^{2}R_e/G$, where to be definite we take $\sigma$ to be the
average spheroid velocity dispersion within the effective
radius $R_e$. For this conversion (where necessary) we adopt
$k = 5$, as is roughly seen in our simulations and expected for
e.g. a Hernquist (1990) spheroid or $r^{1/4}$-law profile, and also
similar to that suggested by comparison of mass measurements
from dynamical modeling and from measurements
of $\sigma$ and $R_e$ (e.g. Cappellari et al. 2005, although compare
Marconi & Hunt 2003, who adopt $k = 3$, which is the primary
reason for the small discrepancy between the relation they observe
and those we show above). The scatter about this relation
from the simulations is small, about 0.3 dex, similar to that
observed.
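
The conversions described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the example spheroid (200 km/s, 3 kpc) is arbitrary, and the coefficient of the virial-mass relation is read as 7.0e-4.

```python
G_KPC = 4.30091e-6  # gravitational constant [kpc (km/s)^2 / Msun]


def virial_mass(sigma_kms, r_e_kpc, k=5.0):
    """M_vir = k sigma^2 R_e / G [Msun]; k = 5 is the value adopted in the text."""
    return k * sigma_kms**2 * r_e_kpc / G_KPC


def mbh_from_mvir(m_vir_msun):
    """Black hole mass from virial mass, assuming a coefficient of 7.0e-4."""
    return 7.0e-4 * m_vir_msun


def mbh_from_msph(m_sph_msun):
    """Black hole mass from total spheroid stellar mass (coefficient 0.001)."""
    return 1.0e-3 * m_sph_msun


# Illustrative spheroid: sigma = 200 km/s, R_e = 3 kpc (arbitrary example values)
m_vir_example = virial_mass(200.0, 3.0)
m_bh_example = mbh_from_mvir(m_vir_example)
k_ratio = virial_mass(200.0, 3.0, k=3.0) / m_vir_example
```

Switching from k = 5 to k = 3 lowers the inferred virial mass by a factor of 0.6 at fixed velocity dispersion and radius, which is the kind of definitional offset the text invokes when comparing with observed relations.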

We note that there are considerable observational contradictions
regarding possible evolution in the $M_{\rm BH}$-$\sigma$ or
$M_{\rm BH}$-$M_{\rm sph}$ relations, with e.g. Shields et al. (2003) and
Adelberger & Steidel (2005) finding no evolution to z ~ 3 and
e.g. Peng et al. (2005) and McLure et al. (2005) finding substantial
evolution at z < 2 (specifically, substantially undermassive
bulges at z ~ 2). However, these observations are
still difficult and have large uncertainties; furthermore, they
specifically select primarily active, high Eddington ratio objects,
which local observations (e.g., Barth et al. 2005) suggest
may be biased to lie above the $M_{\rm BH}$-$\sigma$ relation in the
manner observed. Above z ~ 2, the possibility for such evolution,
and the uncertainty resulting from it, is essentially captured
in our consideration of pure luminosity vs. pure density
evolution for the quasar luminosity function, since these
different evolutions imply a different peak luminosity (i.e. final
spheroid mass) distribution. Thus, the uncertainties introduced
by such evolution are not significantly larger than
those we already describe, unless there is large evolution at
0 < z < 2. Even such evolution in the $M_{\rm BH}$-$M_{\rm sph}$ relation will
not change many of our conclusions, if the stellar mass of the
final spheroid is primarily formed at this time, but is simply
assembled (presumably in subsequent dry mergers) at later
times. The alternative, that this stellar mass is formed between
z = 2 (when the massive black holes were formed) and z = 0, is
ruled out strongly by many observations which show the host
spheroids of these black holes have old stellar populations
with redshifts of formation z ~ 1.5-2.5 (e.g. Bower et al.
1992; Jørgensen et al. 1996; van Dokkum & Franx 1996;
Ellis et al. 1997; Bernardi et al. 1998; Jørgensen et al. 1999;
van de Ven et al. 2003; Cross et al. 2004; Wuyts et al. 2004;
Bell et al. 2004b; Förster Schreiber et al. 2004; Labbé et al.
2005).



Robertson et al. (2005c, in preparation) employ our simulations
to study the fundamental plane relation between
spheroid effective radius $R_e$, velocity dispersion $\sigma$, and stellar
surface mass density $\Sigma$ of the merger remnants. In this, the
projected stellar surface density $\Sigma$ is calculated along many
different lines-of-sight, and for each, $R_e$ is determined as the
two-dimensional radius enclosing half the stellar mass, and
$\sigma$ is the mass-weighted line-of-sight stellar velocity dispersion
within an aperture of radius $R_e$. When compared to e.g.
the observed K-band fundamental plane, for which a constant
mass-to-light ratio is a reasonable approximation, the remnant
spheroids of gas-rich mergers from our simulations fall
on the observed infrared fundamental plane {R,, cx cr' ^^S"" ™, 
e.g., Pahre et al. 1998a b) with little scatter. This relation and 
direct measurement yields a stellar mass-effective radius rela- 
tion of the form R^ cx Msph(^f.)'^ (where Ms^hiRe) is the stellar 
mass within the effective radius), or 

log(7?,/kpc) = a + /3 log(Msph(^.)/M0). (8) 

The average Msph(Re)-Re relation found in our simulations has
best-fit coefficients α = -5.81, β = 0.57 (i.e.,
Re ≈ 0.86 kpc [Msph(Re)/10^10 M⊙]^0.57), in good agreement with
observations (Shen et al. 2003) after accounting for the small
difference between the effective radius used here and the observed
half-light radius. The exact relation has a weak dependence on
redshift, which we include, but we find little difference in our
results in either case as, for example, at z = 0, α = -5.6,
β = 0.56, and at z = 2, α = -5.4, β = 0.53. Observations also
suggest only weak evolution in this relation (e.g., Trujillo &
Aguerri 2004; Trujillo et al. 2004, 2005; McIntosh et al. 2005).
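
As a quick numerical illustration of Equation 8, the following minimal sketch evaluates the effective radius implied by a given stellar mass within Re. The function name `effective_radius_kpc` is hypothetical (it is not part of the simulation analysis), and the coefficients are simply the z = 0 and z = 2 values quoted above.

```python
import math


def effective_radius_kpc(m_sph_re_msun, alpha, beta):
    """Equation 8: log10(Re/kpc) = alpha + beta * log10(Msph(Re)/Msun)."""
    return 10.0 ** (alpha + beta * math.log10(m_sph_re_msun))


# Coefficients quoted in the text for z = 0 and z = 2,
# evaluated at Msph(Re) = 1e10 solar masses.
re_z0 = effective_radius_kpc(1e10, alpha=-5.6, beta=0.56)
re_z2 = effective_radius_kpc(1e10, alpha=-5.4, beta=0.53)
print(f"Re(z=0) = {re_z0:.2f} kpc, Re(z=2) = {re_z2:.2f} kpc")
```

At fixed mass the two coefficient sets differ by about 0.1 dex in Re, which gives a feel for how weak the quoted redshift dependence is.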

We use this relation to convert mass-to-light ratios as a
function of stellar mass into a luminosity-size relation in
§ 5, but we can also use it to determine the spheroid stellar
mass as a function of virial (dynamical) mass and black hole
mass. Combining the equations above,

\frac{M_{\rm sph}(R_e)}{M_{\rm vir}} \approx 0.3 \left(\frac{M_{\rm vir}}{10^{11}\,M_\odot}\right)^{-0.2}. (9)



This agrees well with observations (e.g., Bernardi et al. 2003a;
Padmanabhan et al. 2004; Cappellari et al. 2005) and additionally
follows from the observed Msph-Re relation given that
Mbh ∝ σ^4 ∝ Mvir. Note that we have defined the stellar mass
Msph(Re) as that within the effective radius Re; this means that
the total galaxy stellar mass is Msph ~ 2 Msph(Re). These
relations are determined from the simulations to be independent
of redshift (except for the weak evolution in Msph-Re, which we
account for). When only the total stellar mass is needed, we use
the directly fitted Mbh-Msph relation described above, as it both
avoids the uncertainties inherent in these conversions and
accounts for, e.g., changing bulge-to-disk ratios as a function
of mass.

In what follows, we are not concerned with the structure of
individual galaxies, and so defer a detailed structural analysis 
of merger remnants (Robertson et al. 2005c, in preparation). 
We instead use the relations above to study the statistical prop- 
erties (i.e. conditional age and mass distributions) and evolu- 
tion of the red galaxy population. We emphasize that although 
we use the form of these relations from our simulations, be- 
cause each agrees well with its observed counterpart, it makes 
no difference to our results whether we use the relations from 
our simulations or adopt the observed scalings. 

The simulations yield relationships between black hole 
mass and either velocity dispersion or stellar spheroid mass 
that can be used to transform the birthrate of black holes of 
a given final mass Mbh, ṅ(Mbh), into a birthrate of remnants
with definite velocity dispersion, ṅ(σ), or stellar spheroid mass,
ṅ(Msph). This is illustrated in Figure 2, where the right panel
gives the ṅ(Mbh) (dotted) and ṅ(Msph) (solid) relations derived
using the fitted relations above, our modeling of quasar
lifetimes, and the observed quasar luminosity function. Although
there are several steps in this procedure, we emphasize that
all of the relationships used, each agreeing with observations,
are determined entirely from the simulations, in a
self-consistent manner. Any additional modeling required beyond
this point is further calculated self-consistently from the
simulations and is directly constrained by observations of quasars
(e.g., the cases of obscuration and quasar lifetimes; Hopkins et
al. 2005e) or galaxies (e.g., star formation and stellar
population synthesis models; Bruzual & Charlot 2003). The lone
observational input is the observed quasar luminosity function,
from which we derive the birthrate of spheroids of a given 
mass (or velocity dispersion). 
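
The change of variables described above can be sketched as follows. This is a toy illustration only: it assumes a scatter-free power law, log Mbh = log_mu + gamma * log Msph, with placeholder coefficients, and a placeholder lognormal birthrate per unit log Mbh. None of these numbers are the fitted values from the simulations. The point is that a birthrate per logarithmic mass interval picks up the Jacobian d log Mbh / d log Msph = gamma, so the total number of objects is conserved.

```python
import math


def birthrate_per_log_mbh(log_mbh, norm=1.0, log_peak=8.0, width=0.5):
    """Toy lognormal birthrate per unit log10(Mbh); parameters are placeholders."""
    z = (log_mbh - log_peak) / width
    return norm * math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))


def birthrate_per_log_msph(log_msph, log_mu=-3.0, gamma=1.0):
    """Birthrate per unit log10(Msph), assuming log Mbh = log_mu + gamma * log Msph.

    The factor gamma is the Jacobian d(log Mbh)/d(log Msph).
    """
    log_mbh = log_mu + gamma * log_msph
    return gamma * birthrate_per_log_mbh(log_mbh)


def integrate(f, lo, hi, n=4000):
    """Midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


total_bh = integrate(birthrate_per_log_mbh, 3.0, 13.0)
total_sph = integrate(
    lambda x: birthrate_per_log_msph(x, log_mu=-5.0, gamma=1.2), 5.0, 16.0
)
print(total_bh, total_sph)
```

Both integrals recover the same total birthrate, as they must when every black hole is paired with exactly one spheroid. Intrinsic scatter in the relation is ignored here.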

In our simulations, merger remnants resemble elliptical 
galaxies, with small gas fractions, and star formation is
terminated by feedback as the black hole reaches its final mass.
Thereafter, the galaxies mainly evolve passively, without
significant star formation. The timescale for the merger-induced
starburst is ~ 100 Myr (e.g., Springel et al. 2005a), much
shorter than the merger timescale of ~ Gyr. We therefore adopt
the approximation that the merger occurs instantly at the
redshift being considered, and that the remnant does not evolve
after that point (at least up to very high redshifts, where the
Hubble time becomes comparable to the timescale for the merger).
We have considered two cases: one where we assume each spheroid
is formed instantly at the redshift under consideration, and a
second where we assume the starburst to have a Gaussian shape in
time with a peak at z and a characteristic falloff timescale
(standard deviation) of ~ 100 Myr. We find essentially no
difference in our predictions between these cases, except for a
slight reddening of typical galaxy colors at high redshift in the
latter case. We also explicitly calculate the possible
consequences of subsequent "dry merging" in § 4.1 below, and show
that they are small.

Given the birthrate of spheroids, we use the stellar population
synthesis models of Bruzual & Charlot (2003) to determine their
observed luminosities and colors. The remnants in our simulations
typically have solar metallicities, even at high redshift (as
expected from observations of high-redshift red galaxies; e.g.,
van Dokkum et al. 2004; Förster Schreiber et al. 2004), as metal
enrichment occurs through star formation and associated supernova
feedback in the densest regions of the galaxy, and metals are
distributed throughout the galaxy by quasar feedback (Cox et al.
2005a). To examine the impact of the metallicity on the stellar
population, we consider two cases: one in which the remnants are
assumed to have solar metallicity (0.02), and a second in which
they have a Gaussian metallicity distribution with a mean of solar
metallicity and a standard deviation of ~ 0.005-0.01. We find
little difference in our results between these two cases.

A scaling of metallicity with mass or velocity dispersion 
σ could also influence our predicted luminosity functions.
There is some observational evidence of a correlation between
metallicity and σ (e.g., Worthey et al. 1992; Jørgensen 1997;
Kuntschner 2000), but the inferred metallicities are degenerate
with the modeled population ages (Worthey et al. 1995; Faber et
al. 1995; Worthey 1997), and some studies infer no connection
between metallicity and either velocity dispersion or age (e.g.,
Bernardi et al. 2003, 2005) or find that the observed scaling of
Mg and Hβ line profiles is consistent with more massive
ellipticals having formed earlier (e.g., Fisher et al. 1995,
1996). Moreover, the analysis of the joint correlation of
metallicity with age and velocity dispersion by Jørgensen (1999)
and Jørgensen et al. (1999) indicates that the relation between
typical age and σ implies very little net change in metallicity
in observed populations. Also, it is the Mg2 and Hβ line indices
which are well correlated with velocity dispersion (Burstein et
al. 1988; Worthey et al. 1992; Blakeslee et al. 2001); the ⟨Fe⟩
index shows only weak correlation with velocity dispersion
(Jørgensen 1997; Trager et al. 1998), and so it is not clear
whether this is a result of an enhancement of α elements or
depressed Fe, and therefore it is difficult to translate to
metallicity. Regardless of these uncertainties, the effect is
considerably smaller than that of changing mean spheroid ages
with mass (as demonstrated below), as, e.g., even for the extreme
case of the evolution reported by Kuntschner (2000), with
[Fe/H] = 0.56 log(σ/100 km s^-1), this results in only a change
from Z ~ 0.8 Z⊙ at Msph ~ 5 × 10^ M⊙ to Z ~ 2.2 Z⊙ at
Msph ~ 2 × 10^ M⊙, ultimately shifting, e.g., the z = 0 B-band
galaxy luminosity function by only 0.1 magnitude.

Because these effects are weak compared to the age effects in
the stellar populations we model, we do not impose a scaling of
metallicity with mass or velocity dispersion, deferring a
treatment of the chemical enrichment histories of galaxies to
future work (but see, e.g., Brook et al. 2004a; Robertson et al.
2005d; Font et al. 2005), but note that its addition does not
create any conflict between our predictions and observations.
However, these relatively small scalings could be important for
the observed colors, so we do briefly consider the possible
effects of changing metallicity with σ in § 5, where we show that
the effect is small. We do not include the effects of dust
reddening on the galaxy population, as our simulations show a
dramatic and rapid falloff in characteristic column densities
after the starburst, when the black hole expels surrounding gas
as it reaches its final mass. This is also consistent with
observations showing that only a small fraction (< 10%) of the
luminosity in red galaxies can come from dusty, intrinsically
bluer sources (Bell et al. 2004a).

3. THE RELIC VELOCITY DISPERSION DISTRIBUTION AND MASS 
FUNCTIONS 

In § 2.1 and § 2.3 we have determined ṅ(Mbh), the rate at
which quasars of a given final black hole mass are formed in
mergers, and fit this to an analytical form. Having also
determined the Mbh-σ relation as a function of redshift and its
intrinsic dispersion from our simulations, we can then convert
ṅ(Mbh) to ṅ(σ), the birthrate of spheroids of a given velocity
dispersion as a function of redshift. To do so, we account for

Fig. 2. — The relic distribution of velocity dispersions (as defined in § 2.3
and as would be inferred from the Mbh-σ relation) at z = 0 (solid), 1 (dotted),
2 (dashed), 3 (dot-dashed), and 5 (triple-dot-dashed). The 1σ range
of observations of velocity dispersions in elliptical galaxies is shown, from
Sheth et al. (2003) (yellow shaded region), with the contribution from bulges
in S0 and spiral galaxies from Aller & Richstone (2002) (orange shaded region).



the intrinsic dispersion of the Mbh-σ relation, by inverting

\dot{n}(M_{\rm BH}) = \int_0^\infty P(M_{\rm BH}|\sigma)\,\dot{n}(\sigma)\,d\sigma, (10)

where we assume that P(Mbh|σ) is distributed as a lognormal
about the value given by the Mbh-σ relation, with a dispersion
equal to that in our determined (and the observed) relation,
~ 0.3 dex. With our modeling of spheroid and black hole
co-formation in a single (dominant) major merger, the inversion
of Equation 10 above is straightforward, as derived by
Yu & Lu (2004) as a method to determine the velocity
distribution function at various redshifts for which direct
observations of velocity dispersions are inaccessible. Thus,
knowing ṅ(σ), we can integrate over time (redshift) to determine
the relic number density of sources with a given velocity
dispersion, n(σ) = dn/d log(σ).
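
To make the structure of Equation 10 concrete, the sketch below evaluates its forward direction numerically: a birthrate in σ is smeared into a birthrate in Mbh by a lognormal of width 0.3 dex. It does not reproduce the inversion used in the text. The mean relation (normalization a = 8.1 at 200 km/s, slope b = 4) and the Gaussian toy birthrate in log σ are illustrative assumptions, not the fitted quantities.

```python
import math

SCATTER_DEX = 0.3  # intrinsic dispersion of the relation quoted in the text


def gaussian(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mean_log_mbh(log_sigma, a=8.1, b=4.0):
    """Assumed mean relation: log10 Mbh = a + b * log10(sigma / 200 km/s)."""
    return a + b * (log_sigma - math.log10(200.0))


def toy_rate_per_log_sigma(log_sigma):
    """Placeholder birthrate per unit log10(sigma), centered on 150 km/s."""
    return gaussian(log_sigma, math.log10(150.0), 0.2)


def rate_per_log_mbh(log_mbh, lo=1.0, hi=3.5, n=600):
    """Midpoint-rule version of Equation 10 in logarithmic variables."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        ls = lo + (i + 0.5) * h
        kernel = gaussian(log_mbh, mean_log_mbh(ls), SCATTER_DEX)
        total += kernel * toy_rate_per_log_sigma(ls) * h
    return total


def total_rate(lo=2.0, hi=14.0, n=400):
    """Integrate the smeared birthrate over log10(Mbh)."""
    h = (hi - lo) / n
    return h * sum(rate_per_log_mbh(lo + (i + 0.5) * h) for i in range(n))


print(round(total_rate(), 4))
```

Because the kernel P(Mbh|σ) is normalized, the total birthrate is unchanged by the scatter; only the shape of the distribution is broadened, which matters most in the steep high-mass tail.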

The results of this integration to z = 0 are shown in Figure 2
(thick solid line). Our theoretical estimate agrees well with
the observed distribution of velocity dispersions found for
local z = 0 ellipticals by Sheth et al. (2003) (1σ range shown as
the yellow shaded region). The contribution from spheroids
in S0 and spiral galaxies, determined by Aller & Richstone
(2002), is added to this and shown also at the low-σ end
where it dominates (1σ range shown as the orange shaded
region). We caution that our prediction at low σ is somewhat
sensitive to the assumed faint-end slope in the birthrate of
black holes of a given mass [ṅ(Mbh)], as these are not
necessarily the products of major mergers. Our estimate is on
the high side at the extreme large-σ tail of the distribution,
but this is where both the observations are uncertain and our
modeling of the quasar luminosity function and corresponding
black hole mass [ṅ(Mbh)] distribution are sensitive to the
functional form and bolometric corrections adopted.

We can also predict the velocity dispersion function at
different redshifts based on our modeling, and these results are
shown in Figure 2. We note that we have adopted the pure
peak luminosity evolution (PPLE) form for the evolution of
the quasar luminosity function above z ~ 2, where the break
and faint-end slope of the luminosity function are poorly
constrained. If we instead consider the pure density form of this
evolution, the z = 3 and z = 5 distributions peak at significantly
higher σ.

Fig. 3. — Predicted z = 0 stellar mass function in remnant red, elliptical
galaxies (upper left). This is compared to the morphologically selected
spheroid stellar mass function of Bell et al. (2003) (blue diamonds, where
horizontal errors show the systematic mass uncertainty). The red dot-dashed line
shows our prediction allowing for subsequent dry mergers. Upper right shows
the mass function at z = 0 (black), z = 0.5 (blue), z = 1 (cyan), z = 2 (green),
z = 3 (orange), and z = 5 (red). Lower left shows the integrated stellar mass
density as a function of redshift, lower right the star formation rate. The solid
lines adopt pure peak luminosity evolution for the quasar luminosity function
above z = 2, the dashed lines adopt pure density evolution.

We can perform an identical procedure, using instead the 
relations between black hole mass and host galaxy stellar 
mass to obtain the relic stellar mass function and its evolution 
with redshift. Figure 3 shows the resulting z = 0 stellar mass
function in remnant red, elliptical galaxies (upper left). This
is compared to the morphologically selected spheroid stellar
mass function of Bell et al. (2003) (blue diamonds, where
horizontal errors show the systematic mass uncertainty). In all
panels, the solid lines adopt pure peak luminosity evolution
(PPLE) for the quasar luminosity function above z = 2, and
the dashed lines are for pure density evolution (PDE), as
defined in § 2.2. The agreement is good over the entire range of
observed masses, especially considering the systematic
uncertainties in the observations. As is demonstrated for the
galaxy luminosity function in Figure 4, adopting an idealized
"light-bulb" or pure exponential light curve model for the quasar
lifetime will not produce the turnover and shallow slope of the
faint end of this mass function, and will overpredict the
low-mass end by 2-3 orders of magnitude. The upper right of the
figure shows the mass function at various redshifts, the lower
left shows the integrated stellar mass density as a function of
redshift, and the lower right the star formation rate. The
evolution of the star formation rate qualitatively agrees well
with that estimated by, e.g., Cole et al. (2001), but we do not
account for the star-forming spiral population, which constitutes
a significant or even dominant fraction of the integrated star
formation rate, and so our present results are not necessarily
in conflict with cosmological simulations indicating that the
total mean density of cosmic star formation peaks at z ~ 4-5
(see, e.g., Springel & Hernquist 2003b; Hernquist & Springel
2003; Nagamine et al. 2004a).

Subsequent gas-poor ("dry") mergers, by definition, do not 
have a reservoir of cold gas, and as a result cannot excite 
bright quasar activity. Therefore, the empirical information 
we derive on the rate at which spheroids are born as a function
of mass and redshift from the quasar luminosity function
does not account for dry merging. However, we can estimate
the potential impact of spheroid-spheroid mergers on our
predictions. Recent observations (Bell et al. 2005; van Dokkum
2005) suggest that z = 0 spheroids have, on average, undergone
~ 0.5-1 major dry mergers since z ~ 0.7 (see also Carlberg
et al. 1994; Le Fèvre et al. 2000; Patton et al. 2002;
Conselice et al. 2003, although de Propris et al. 2005 estimate a
significantly lower value ~ 0.2). Observations and our
predictions for the birth redshifts of spheroids (see §) imply
that there should not be significant dry merging much earlier,
as most spheroids are either recently formed or still forming
at higher redshifts. Therefore, we can estimate the effects of
dry merging by assuming that each spheroid has undergone
~ 0.5 major dry mergers in its history down to z = 0. For
simplicity, we assume these are equal-mass dry mergers; i.e., for
a given interval in mass, we assume half the number of predicted
spheroids dry merge, halving their number density but
doubling their mass.
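
The bookkeeping of this dry-merging estimate can be sketched on a binned mass function. The snippet below is a minimal illustration under stated assumptions: the bins are taken to be spaced by exactly log10(2) dex, so that doubling the mass moves an object up one bin, and the input number densities are arbitrary placeholder values. The function name `apply_dry_mergers` is hypothetical.

```python
def apply_dry_mergers(number_density, merging_fraction=0.5):
    """Apply one round of equal-mass dry mergers to a binned mass function.

    number_density[i] is the number density in log-mass bin i. Bins are
    assumed to be spaced by log10(2) dex, so a doubled mass lands in bin i + 1.
    Returns a list one bin longer than the input.
    """
    out = [0.0] * (len(number_density) + 1)
    for i, n in enumerate(number_density):
        merging = merging_fraction * n
        out[i] += n - merging        # spheroids that do not merge
        out[i + 1] += 0.5 * merging  # pairs merge: half as many, twice the mass
    return out


before = [8.0, 4.0, 1.0]  # placeholder number densities per bin
after = apply_dry_mergers(before)
print(after)
```

With half of each bin merging, the total number of spheroids drops to 75% of its original value while the total stellar mass is conserved, which is why the net change to the mass function stays modest.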

The resulting mass function is shown by the red dot-dashed
line in the upper left of Figure 3 (for the pure peak luminosity
evolution case). The net resulting change, as dry merging
increases spheroid masses but decreases the total number of
spheroids, is generally smaller than typical uncertainties in
our modeling (of, e.g., the functional form of ṅ(Mbh)) and the
observations, and thus we can safely ignore the impact of dry
merging in our subsequent analysis. This is also suggested
by calculations of, e.g., the spheroid luminosity function from
semi-analytical models (Cirasuolo et al. 2005). The effect is
not completely negligible, however, and we note that the
dry-merging corrected mass function agrees very well with the
observations (within ~ 1σ at all masses). Because dry mergers
are not constrained by our empirical approach (unlike gas-rich
mergers, which produce a signal in the quasar luminosity
function), and the rate and impact of dry galaxy mergers is
observationally uncertain, we do not include their effects in
any of our other predictions, but emphasize here that they
introduce a relatively small second-order effect which does not
result in any conflict with the observations.

4. GALAXY LUMINOSITY FUNCTIONS 
4.1. The B-band Luminosity Function at All Redshifts

Unlike the relic velocity dispersion function, which is de- 
termined by the integrated history of spheroids, the evolution 
of stellar luminosities and colors makes the galaxy luminosity 
function in different wavebands dependent on the time history
of spheroid formation. Because of this, it is not guaranteed
that successfully reproducing the z = 0 black hole mass
distribution will yield an accurate prediction for the galaxy
luminosity function at z = 0 or higher redshifts.

As outlined in § 2.3, we use the observed quasar luminosity
function and our simulations of quasar evolution to determine
the birthrate of black holes of mass Mbh, and correspondingly
spheroids of stellar mass Msph, as a function of redshift. For
a given observed redshift z_obs, we can then integrate over
z > z_obs to determine the history of the spheroids observed at
z_obs; i.e., for a given z_obs and Msph, the distribution of
ages/formation times is completely determined. Knowing the
formation history for these spheroids, we use the stellar
population synthesis model of Bruzual & Charlot (2003) to
determine their observed magnitudes in any given band at z_obs.

We show our prediction for the rest-frame B-band
red/elliptical galaxy luminosity function at a series of
observed redshifts in Figure 4. In each panel, our predicted
B-band luminosity function for the redshift indicated in the
upper left is shown as the thick black line. When a range of
z is indicated in the upper left of a panel, the predicted
luminosity functions at both the minimum and maximum redshift
of the range are given. The 1σ range in the observed
luminosity function at each redshift (or redshift range) is
indicated as a shaded region. At z ~ 0-0.1 (median z =
0.04), the observed luminosity function of Madgwick et al.
(2002), determined from the 2dFGRS survey, is shown in
cyan. At z = 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0, and 1.0-1.2,
the shaded region shows the observed luminosity functions
from Faber et al. (2005), determined from the DEEP2
(yellow) and COMBO-17 (red) surveys (Bell et al. 2004b;
Willmer et al. 2005). At z = 2.0 and z = 3.0, the observations
from Giallongo et al. (2005), from the Hubble Deep Field and
K20 surveys, are shown in green. At z = 5 there is no observed
B-band luminosity function, but we show our prediction.

In each case, the observed luminosity function is determined
from either morphologically selected elliptical galaxies
or color-selected red galaxies (especially at high redshift
where morphological information is not available), which, as
noted earlier, are similar at least at low to moderate
(z ≲ 1-2) redshifts (e.g., Strateva et al. 2001; Bernardi et
al. 2003d; Bell et al. 2004a; Ball et al. 2005). Our predictions
agree well with the observations, over a wide range of
magnitudes and redshifts. We slightly overpredict the bright end
of the luminosity function at high redshift, but this can be
explained by selection effects, as we show below in § 5, because
many of these very bright, high-redshift galaxies are quite blue
(as they have formed only recently at these high redshifts) and
thus would not appear in an observed red galaxy luminosity
function (although this is also somewhat related to our slight
overprediction of the high-σ end of the velocity dispersion
function in Figure 2).

In Figure 4, we also show the predicted B-band
red/elliptical galaxy luminosity function at each redshift
using a commonly employed, idealized model for the quasar
lifetime (blue dashed lines). Here, we assume that a quasar
radiates at its peak luminosity L = Lpeak for a fixed time equal
to 10^7 yr (as is often adopted, and similar to the Salpeter
time for e-folding of an Eddington-limited black hole, t_S =
4.2 × 10^7 yr), but we note that the entire class of
"light-bulb" or exponential growth/decay models for the quasar
light curve produces a nearly identical prediction to that shown.
This model overpredicts the number of red/elliptical galaxies
which should be observed at low luminosities by two orders
of magnitude, does not reproduce the shape and curvature of
the luminosity function, and underpredicts the bright end if
the lifetime is chosen to be longer (e.g., the actual Salpeter
time). The quasar lifetime in these models is a free parameter,
but it determines only the normalization of this curve,
and thus no value can produce a reasonable prediction for the
galaxy luminosity function. The reason for the failure of these
models at low luminosity is, as mentioned above, the fact that
they associate objects observed at low luminosities with
low-Lpeak objects, and therefore low-Mbh objects in small-mass
spheroids.

Figure 4 also shows our prediction (dot-dashed lines), for
the mean redshift of each bin, assuming pure density evolution
(PDE) instead of pure peak luminosity evolution (PPLE)
for the birthrate of quasars with a given peak luminosity above
z ~ 2. Although the observed quasar luminosity function
does not provide a good constraint on which evolution is
followed, the difference in our subsequent calculations is usually
minimal, and observations of the faint end of the galaxy
luminosity function at moderate and high redshifts (where the
two predictions begin to diverge) do not yet exist. However,
if such observations of the galaxy population can be made,
or the ages of the lowest-mass/luminosity objects at z ~ 0
are measured, they can provide a powerful constraint on the
ṅ(Lpeak) [ṅ(Mbh), ṅ(Msph)] distributions (i.e., the rates at which
spheroids and quasars of given properties form with redshift).

Fig. 4. — Predicted B-band red/elliptical galaxy luminosity function (solid lines) at different redshifts (shown in the upper left of each panel). For panels with
a range of redshifts shown, the lines show our prediction at the minimum and maximum redshift. Dot-dashed lines show the prediction assuming pure density
evolution instead of pure peak luminosity evolution above z ~ 2, at the mean redshift of each bin. Shaded ranges show the 1σ range of observed luminosity
functions from Madgwick et al. (2002, cyan), Faber et al. (2005, yellow and red), and Giallongo et al. (2005, green). The blue dashed line shows the prediction
obtained using an idealized model for the quasar lifetime in which quasars grow/decay exponentially or turn on/off as a step function.

4.2. The Evolution of the Luminosity Function with Redshift 

The observed galaxy luminosity function is usually fit
to a Schechter function (Schechter 1976) with normalization
φ*, characteristic magnitude (luminosity) M* (L*), and
faint-end slope α. This yields a total number density of
galaxies Φ = φ* Γ(α + 1), and a total luminosity density j =
φ* L* Γ(α + 2). We can determine Φ and j by integrating
our predicted luminosity function at each redshift. However,
observationally, it is easier to determine φ* than Φ, as α is
difficult to measure and a constant α is often assumed. To
compare directly with most observations (e.g., Cohen 2002;
Bell et al. 2003; Madgwick et al. 2003; Giallongo et al. 2005;
Faber et al. 2005), we therefore assume a constant α0 = -0.5
and calculate φ* = Φ/Γ(α0 + 1), and likewise calculate M*
[L* = j/(φ* Γ(α0 + 2))].
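
The conversion between integrated quantities and Schechter parameters described above can be written out as a short sketch. The function names are hypothetical, and the slope and input values in the example are illustrative choices rather than fitted results; the only relations used are Φ = φ* Γ(α + 1) and j = φ* L* Γ(α + 2).

```python
import math


def schechter_totals(phi_star, l_star, alpha):
    """Total number density and luminosity density of a Schechter function."""
    number_density = phi_star * math.gamma(alpha + 1.0)
    luminosity_density = phi_star * l_star * math.gamma(alpha + 2.0)
    return number_density, luminosity_density


def schechter_params_from_totals(number_density, luminosity_density, alpha0):
    """Invert the totals for phi* and L* at a fixed, assumed faint-end slope."""
    phi_star = number_density / math.gamma(alpha0 + 1.0)
    l_star = luminosity_density / (phi_star * math.gamma(alpha0 + 2.0))
    return phi_star, l_star


# Round trip with placeholder values (phi* in Mpc^-3, L* in solar units).
alpha0 = -0.5
total_n, total_j = schechter_totals(2.0e-3, 1.0e10, alpha0)
phi_back, l_back = schechter_params_from_totals(total_n, total_j, alpha0)
print(phi_back, l_back)
```

Note that the mean luminosity per galaxy, j/Φ, equals L* Γ(α + 2)/Γ(α + 1) = (α + 1) L*, so fixing the slope ties L* directly to the ratio of the two integrals.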

Figure 5 shows φ*, M*_B, and j_B as a function of redshift.
Our prediction is shown as a solid black line both in a low
redshift interval z < 1.2 (upper panels) and over the entire
z < 6 interval (lower panels). At low redshifts (upper panels),
observations from Faber et al. (2005) (COMBO-17, red
circles, and DEEP2, black squares), Madgwick et al. (2003)
(2dF, orange diamonds), Bell et al. (2003) (SDSS, blue x's),
and Im et al. (2002) (DEEP1, green stars) are shown, with
1σ errors, and at high redshifts (lower panels), observations
from Giallongo et al. (2005) (Hubble Deep Field and K20)
are shown (circles).

Although we slightly overpredict φ* (and thus j_B as a
consequence) at z ~ 1.2, this is related directly to our small
overprediction of the bright blue end of the luminosity function
discussed in § 4.1 and, as discussed in § 5, can be explained
by selection effects, as these objects have recently formed and
are bluer than their traditionally color-selected counterparts.
We estimate the results of this selection effect in the upper
panels, where the dashed lines show our prediction ignoring
all objects which have formed (i.e., gone through their peak
merger/quasar activity) less than 1 Gyr in the past, and thus
have not had sufficient time to redden to the point where they
would be recognized as red galaxies in color-selected surveys
(this corresponds roughly to the color selection of, e.g., Bell
et al. 2004b, given our modeled metallicities and star
formation histories). The agreement at z ~ 1-1.2 is significantly
improved, suggesting that the strong increase in red galaxies
from z ~ 1 to the present is driven in part by continued
formation and mergers associated with ongoing (though declining)
quasar activity, and in part by the reddening of spheroids
formed in mergers at the peak of quasar activity (z ~ 1-2)
to the point where they will be recognized as red
ellipticals by z ~ 0.

Fig. 5. — B-band luminosity function normalization φ* (left), characteristic magnitude M*_B (center), and luminosity density j_B (right) predicted by our
model (black lines) as a function of redshift for z = 0-6. Upper panels show the z < 1 range in greater detail. Dashed lines show the prediction ignoring
recent (< 1 Gyr) mergers. Observations are from Faber et al. (2005) (COMBO-17, red circles, and DEEP2, black squares), Madgwick et al. (2003) (2dF, orange
diamonds), Bell et al. (2003) (SDSS, blue x's), and Im et al. (2002) (DEEP1, green stars) for the low-redshift (upper) panels. Results from Giallongo et al.
(2005) (Hubble Deep Field and K20, open black circles) at high redshift (lower panels) are also shown.

Because, in our picture, spheroids and quasars form together
through mergers, the quantities φ*, M*_B, and j_B are
directly related to the quasar luminosity function. Associating
each merger with a single quasar and spheroid, the total
number of red galaxies is given by the integrated number
of quasars produced up to the observed redshift; i.e.,
Φ = ∫ ṅ(QSO) dt, where ṅ(QSO) is the number density of
quasars born per unit time per unit comoving volume. In our
determination of the luminosity function, this is ṅ* = constant,
the normalization of the lognormal ṅ(Mbh) distribution. Thus
φ* = ṅ* t_H(z)/Γ(α + 1), where t_H is the age of the Universe at a
particular redshift. Note that if we adopted pure density
evolution for the quasar luminosity function above z ~ 2, ṅ* would
fall off exponentially above these redshifts, and φ* would
drop correspondingly. Currently, the observations are
insufficient to decide which possibility is correct, but this makes
it clear that estimating the total number of red galaxies at high
redshift in future observations can constrain the form of the
quasar luminosity function evolution.
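
For a constant birthrate, the scaling of φ* with the age of the Universe can be evaluated directly. The sketch below is illustrative only: it assumes a flat matter plus Λ cosmology with H0 = 70 km/s/Mpc, Ω_m = 0.3, and Ω_Λ = 0.7 (parameter choices made here for the example), uses the standard closed-form age for that case, and takes the birthrate normalization and slope as placeholder inputs.

```python
import math


def age_of_universe_gyr(z, h0=70.0, omega_m=0.3, omega_l=0.7):
    """Age at redshift z for a flat matter + Lambda universe (closed form).

    977.8 / h0 is the Hubble time 1/H0 in Gyr for h0 in km/s/Mpc.
    The cosmological parameters are assumptions of this sketch.
    """
    hubble_time_gyr = 977.8 / h0
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return hubble_time_gyr * 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x)


def phi_star(z, ndot_star, alpha):
    """phi* = ndot* t_H(z) / Gamma(alpha + 1) for a constant birthrate.

    ndot_star is in objects per Gyr per unit comoving volume (placeholder).
    """
    return ndot_star * age_of_universe_gyr(z) / math.gamma(alpha + 1.0)


for z in (0.0, 1.0, 2.0):
    print(z, round(age_of_universe_gyr(z), 2), phi_star(z, 1.0e-4, -0.5))
```

In this constant-birthrate case the ratio φ*(z)/φ*(0) is simply t_H(z)/t_H(0), independent of the slope and normalization adopted.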

Likewise, M* is directly related to the break in the observed
quasar luminosity function, which in turn corresponds directly
to the peak in the ṅ(Lpeak) [and corresponding ṅ(Mbh)]
distribution (Hopkins et al. 2005c), and thus gives the peak in
the rate at which spheroids of a given stellar mass are forming
as a function of that stellar mass. Because luminosities evolve
with the age of the stellar population, this is not as trivially
related to the M* of the galaxy population as Φ is to the number
density of quasars being formed, but the two are still closely
related and, in general, increasing M* corresponds to moving
the break in the observed quasar luminosity function to higher
luminosities, and vice versa.

Fig. 6. — Predicted z ~ 0 red/elliptical galaxy luminosity function
in four wavelengths (black lines), in the manner of Figure 4. Observations
are from Budavári et al. (2005) and Treyer et al. (2005) in the near UV
(NUV; yellow; upper left), Madgwick et al. (2002) in B-band (cyan; upper
right), Nakamura et al. (2003) in Sloan r-band (green; lower left), and
Kochanek et al. (2001) in K-band (red; lower right).



4.3. The Luminosity Function in Different Wavebands 

Figure|6lshows our predicted red/elliptical galaxy luminos- 
ity function (solid lines) in several different wavebands at 
z 0; the near ultraviolet (NUV; at 2400 A or 0.24 /i), B-band 
(0.44 /i), r-band (0.66 /i), and K-band (2.18 /i). Each is com- 
pared to the observations (shaded regions or points showing 
Icr errors), shown over the range of magnitudes where data 
exist. The observations shown are from Budavari et al- 
and lTreveretal] (l200l in the NUV from GALEX (yellow; 
Fig. 7. — Predicted luminosity function (black lines) at the minimum and
maximum redshift of each redshift range shown, z = 0.25-0.5 (left panels),
z = 0.5-0.8 (center panels), and z = 0.8-1.05 (right panels), in the manner of
Figure 4. Each is compared to observations from Cohen (2002) (except the
NUV at z = 0.8-1.05, where the observations are insufficient to determine a
luminosity function), in the NUV (yellow), U (cyan), R (green), and K (red)
bands.



upper left), Madgwick et al. (2002) in B-band from 2dFGRS
(cyan; upper right), Baldry et al. (2004) (see also Nakamura et
al. 2003) in Sloan r-band from SDSS (green; lower left), and
Kochanek et al. (2001) in K-band from 2MASS (red; lower
right). The NUV prediction has been rescaled to AB magnitudes
for ease of comparison with the observations. The
agreement in these bands is good, implying that we reproduce
not only the luminosity function in a wide variety of
wavebands, but also the color distribution as a function of
magnitude.

Figure 7 extends this to higher redshift, showing the predicted
luminosity function in the NUV (yellow, top panels),
U-band (0.36 μm; blue, second from top), R-band (green, second
from bottom), and K-band (red, bottom panels) in the three
redshift intervals of Cohen (2002). Again, the shaded regions
show the 1σ range in the observed luminosity function
and the solid lines show our prediction at the minimum and
maximum redshift of each interval. Our predictions also agree
well with the VIMOS luminosity functions in U, B, V, R, and
I from Zucca et al. (2005) for the redshift range z = 0.4-0.9
(these results compare favorably with the plotted luminosity
functions in the center panels).

In Figure 8, we plot the predicted luminosity function at
redshifts z = 0.0, 0.2, 0.5, 1.0, 2.0, and 3.0, and φ*, M*_BAND,
and j_BAND (the normalization, characteristic magnitude, and
total luminosity density in each band, respectively) of each
luminosity function (determined as in § O for redshifts 
z = 0-6. The results are shown for the bands U, B, V, R,
I, J, H, and K, from purple to red, respectively. For φ*,
M*_BAND, and j_BAND, the U, R, and K-band observations of
Cohen (2002) (from the luminosity functions of Figure 7) are
shown as filled circles (with colors matching those of the
corresponding prediction for each band). The z = 0.4-0.9
observations in U, B, V, R, I (with the corresponding colors)
from Zucca et al. (2005) are also shown (diamonds), as are
the z ~ 0 observations of Nakamura et al. (2003) (r, green
triangle) and Kochanek et al. (2001) (K, red square). This
provides a large set of predictions of the shape and integrated
properties (φ*, M*, j) of the red galaxy distribution, for future
comparison with red or elliptical galaxy luminosity functions.
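
As a hedged illustration of how the three integrated quantities relate to
one another, the sketch below assumes that a luminosity function is
described by a Schechter form. That functional form, the function names,
and the parameter values are assumptions made for this example only; they
are not the fits reported here. Under that assumption the luminosity
density follows analytically as j = φ* L* Γ(α + 2), which the code
cross-checks against a direct integral over magnitude.

```python
import math


def schechter_mag(M, phi_star, M_star, alpha):
    """Number density per unit magnitude for an assumed Schechter form."""
    x = 10.0 ** (-0.4 * (M - M_star))  # L / L*
    return 0.4 * math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


def luminosity_density(phi_star, alpha):
    """Analytic luminosity density in units of L*: phi* Gamma(alpha + 2)."""
    return phi_star * math.gamma(alpha + 2.0)


def luminosity_density_numeric(phi_star, M_star, alpha,
                               M_bright=-30.0, M_faint=0.0, n=20000):
    """Trapezoidal integral of (L/L*) phi(M) dM, used as a cross-check."""
    dM = (M_faint - M_bright) / n
    total = 0.0
    for i in range(n + 1):
        M = M_bright + i * dM
        weight = 0.5 if i in (0, n) else 1.0
        x = 10.0 ** (-0.4 * (M - M_star))
        total += weight * x * schechter_mag(M, phi_star, M_star, alpha)
    return total * dM


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the functions.
    phi_star, M_star, alpha = 1.0e-3, -20.0, -0.5
    print(luminosity_density(phi_star, alpha))
    print(luminosity_density_numeric(phi_star, M_star, alpha))
```

The analytic and numerical values should agree closely as long as the
magnitude limits bracket M* by many magnitudes on both sides.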

5. THE COLOR DISTRIBUTION OF RED GALAXIES AS A 
FUNCTION OF MAGNITUDE AND REDSHIFT 

Figure 9 shows our predicted color-magnitude relations for
several different wavebands at a series of redshifts. We plot
the mean colors (lines and open diamonds) at each magnitude
and redshift, with the rms dispersion in the color
distribution shown as vertical error bars. We show four
separate color-magnitude diagrams, for comparison with a
range of observations. These are (u-r) vs. M_r (upper left),
as observed in, e.g., Baldry et al. (2004) and Balogh et
al. (2004), (U-V) vs. M_B (upper right; Cross et al. 2004;
Giallongo et al. 2005; McIntosh et al. 2005a), (U-B) vs. M_B
(lower left; Willmer et al. 2005; Faber et al. 2005), and (R-K)
vs. M_K (lower right; Roche et al. 2002; Pozzetti et al. 2003;
Fontana et al. 2004). For (u-r) vs. M_r, we show the z = 0
color-magnitude relation determined by Balogh et al. (2004)
as solid black circles, with corresponding errors. We also
show the observed (U-V) vs. M_B color-magnitude relations
(filled circles) at z = 0.4-1.0 (blue) and z = 1.3-3.5 (green)
from Giallongo et al. (2005), and find reasonable agreement
despite the much larger uncertainties at these larger redshifts.
The z ~ 0 determination of (U-V) vs. M_B from
McIntosh et al. (2005a) also agrees with our prediction.

We note that although our predicted (R-K) colors are not
as red as those of extremely red objects observed at high
redshift (e.g., Roche et al. 2002; Franx et al. 2003), we are not
attempting to reproduce this population, which is heavily
influenced by the presence of ongoing starbursts and dust
reddening, and possible AGN activity, as is typical of, e.g.,
low-redshift ultraluminous infrared galaxies (e.g., Roche et al.
2002; Miyazaki et al. 2003; Sanders & Mirabel 1996). Our
predictions are, however, consistent with the (R-K) colors
of ellipticals observed by, e.g., Pozzetti et al. (2003). The
presence of even mild dust reddening, which we do not expect
to have a large impact on most of the colors and magnitudes
we show, based on the rapid falloff in column densities
post-merger (Hopkins et al. 2005a), will, however, strongly
redden the (R-K) colors. It is therefore not surprising that
our predicted, intrinsic, non-dust-reddened (R-K) colors are
too blue, and this demonstrates that reproducing these colors
will require more sophisticated models which incorporate
dust reddening in the ISM and possibly the continued production
of dust in stellar winds.

Our modeling reproduces the observed color-magnitude relations
of red/elliptical galaxies over the range of magnitudes
observed and for different observed colors. Furthermore,
the typical dispersion about the mean color at low redshift,
~ 0.2, agrees well with that observed for this population of
galaxies (Baldry et al. 2004; Balogh et al. 2004). We predict
the evolution in this dispersion with redshift, in good agreement
with van Dokkum et al. (2000), who find, based also on
the observations of Bower et al. (1992), Ellis et al. (1997),
and van Dokkum et al. (1998), that the scatter in the
color-magnitude [(U-B) vs. M_B, specifically] relation of all
progenitors of present early-type galaxies increases by a factor ~ 2
between z = 0 and z = 1. Moreover, we reproduce the observed
trend of increasingly blue colors at higher redshift (e.g.,
Bell et al. 2004b; Cross et al. 2004; Giallongo et al. 2005), as
these galaxies have formed more recently and thus have not
reddened as much. This is clear from the comparison with the
observations of Balogh et al. (2004) and Giallongo et al. (2005)
shown, but further, the observed "blueing" of the red galaxy
Fig. 8. — Predicted luminosity function at six representative redshifts (upper left of each panel) in several bands: U (purple), B (blue), V (cyan), R (light green),
I (green), J (yellow), H (orange), and K (red) (with generally decreasing characteristic magnitude). Lower panels show φ*, M*_BAND, and j_BAND for each band, in
the manner of Figure 5. Observed points for φ*, M*_BAND, and j_BAND in U, R, and K bands are shown (filled circles of the appropriate color) from Cohen (2002),
at z = 0 from Kochanek et al. (2001) (K, red square) and Nakamura et al. (2003) (r, green triangle), and at z = 0.4-0.9 from Zucca et al. (2005) (U, B, V, R, I,
diamonds of appropriate colors).



population is observed to be ~ 0.3 magnitudes over the
redshift range z ~ 0-1 (Bell et al. 2004b).

At the high redshifts shown in Figure 9, the slope of the
color-magnitude relation changes, and brighter objects become
bluer than fainter ones. The magnitude of this change
in slope depends on whether we adopt a pure peak luminosity
evolution (PPLE) or pure density evolution (PDE) form
for the quasar evolution at high redshifts, as shown below in
Figure 11. Beyond this, however, this change in slope and
normalization owes to the fact that the most massive remnant
galaxies form at redshifts z ~ 2, corresponding to the observed
peak in bright quasar activity generated in mergers. Thus, at
high redshift, these objects have formed more recently, and
are bluer.

There is some evidence for this, as, e.g., Giallongo et al.
(2005) find a ~ 30% change in the slope of the (U-V)
vs. M_B relation from z = 0.4-1 to z = 1.3-3.5, consistent
with our predictions. Still, although the observations
do not strongly distinguish between the PPLE and PDE
cases at this point, the weaker slope evolution seen in the
PDE case is somewhat more consistent with the observations
of van Dokkum et al. (2000), Bower et al. (1992), Ellis et al.
(1997), and van Dokkum et al. (1998), who find results
consistent with no evolution in the (U-B) vs. M_B slope at
redshifts z = 0-1, and at most a similar ~ 30-40% change over
this redshift range. However, we caution that these samples
are selected either by color (in which case they are obviously
biased against a strong blueing of the high-mass population)
or by morphology. If a considerable fraction of the most massive
galaxies are still forming (i.e. have recently merged or begun
merging), they will not have relaxed and will not be identified
by either criterion. Therefore, we consider the
color-magnitude relation derived if we ignore all objects at any
redshift which have formed less than 1 Gyr in the past (about the
time it takes for significant morphological and color disturbances
from the merger to relax).

Figure 10 shows our predictions with this caveat (in the
manner of Figure 9, also assuming pure density evolution
above z ~ 2), for z = 0-1, as at higher redshifts this cut
excludes all but the objects formed at the highest, most uncertain
redshifts. As is clear in the figure, this further reduces the
evolution in the slope, with the change in slope over this redshift
range in each color-magnitude relation essentially consistent
with zero.

We do not explicitly model populations of "old" pre-merger
stars (although these are included in our simulations), which
should form in the progenitor disks before the merger.
Although at times long after the merger this should not be a
significant contributor to the galaxy colors, as much of the
stellar population is formed in a strong starburst, the effect
could be significant for massive galaxies which have recently
formed, reddening these objects and reducing (or even reversing)
the slope evolution shown. Regardless, this slope change
is difficult to observe, even in the absence of the strong limits
to measured magnitudes and colors imposed from observations
at higher redshift, as some of these objects become
Fig. 9. — Predicted mean color (diamonds and lines) as a function of magnitude for several color-magnitude pairs (as labeled), with rms deviation in the color
distribution (vertical errors). Our predictions are shown for z = 0 (black), z = 0.2 (purple), z = 0.5 (blue), z = 1 (cyan), z = 2 (green), z = 3 (orange), z = 5 (red),
with bluer colors at higher redshift. Our results are compared to observations of (u-r) vs. M_r at z = 0 (black circles, upper left; Balogh et al. 2004), and (U-V)
vs. M_B from z = 0.4-1.0 and z = 1.3-3.5 (blue and green circles, respectively, upper right; Giallongo et al. 2005). Pure density evolution is assumed for the
quasar luminosity function above z ~ 2.




Fig. 10. — Same as Figure 9, but excluding all spheroids which have
formed less than 1 Gyr before the observed redshift.



blue or morphologically disturbed enough that they will not
be classified as red/elliptical galaxies. This explains our slight
overprediction of the very bright end of the galaxy luminosity
function at redshifts z > 1-2 in § 4.1 and § 4.2, as these
galaxies correspond to the rapidly blueing galaxies in these
color-magnitude relations and will not appear in the observed
red galaxy luminosity functions.
As noted above, we also do not include the effects of
dust reddening, which can become important for recently
formed galaxies in which star formation has not yet terminated
(i.e. massive galaxies at high redshift), as our modeling
in Hopkins et al. (2005a) and observations of the high-redshift
massive red galaxy populations (Labbé et al. 2005) indicate,
and will most likely also reduce or even reverse the plotted
evolution in slope. However, we do not expect this to have a
strong effect on the typical mean colors at a given redshift,
except perhaps for the very highest redshifts where most galaxies
may still be actively merging.

Despite these caveats, we can make two further predictions
from our modeling. First, the observed bimodality in the
distribution of galaxy colors should break down at large redshift,
especially at high luminosities, as the bright-end merger
remnants become bluer. Specifically, we predict, in the absence
of strong evolution in the blue color population, that the two
color distributions should coalesce around z ~ 1.5-2, as is
observed by, e.g., Willmer et al. (2005) and Giallongo et al.
(2005). Second, the fraction of red galaxies (classified on the
basis of the z ~ 0 bimodal color distribution), which dominate
the bright end of the luminosity function at low redshift,
should decrease at higher redshift (i.e. the bright end of the
luminosity function should have an increasing contribution
from "blue" galaxies, in reality the same as the red elliptical
remnants observed at z ~ 0 but formed more recently and
thus bluer), as observed by Cross et al. (2004), Daddi et al.
(2004), and Somerville et al. (2004). These authors find that a
fraction as large as ~ 1/3-1 of these galaxies show irregular
morphologies, providing evidence for merger-driven
interactions by z ~ 1.5-2.0, as we expect based on their formation
redshifts (see also Figures 20 and 22 below). This also
explains the observations of Arnouts et al. (2005) in the far
UV (1500 Å) from GALEX and Brinchmann et al. (1998) in
HST morphological surveys, who find that the density of
unobscured starburst or peculiar merging galaxies increases
dramatically from z = 0 to z ~ 1, where they begin to dominate the
bright end of the cumulative (spiral and elliptical) luminosity
function, as anticipated from the color-magnitude evolution
of Figure 11 and the excess of bright blue (recently forming)
galaxies beginning to appear at this redshift in Figure 3. This
is also expected from our modeling of the co-production of
quasars and spheroids, as numerous observations have found
that the host galaxies of quasars at high redshift (which should
relax to become normal present ellipticals) are excessively
blue, both from AGN contributions and recent starburst activity
(see, e.g., Bahcall et al. 1997; Canalizo & Stockton 2001;
Dunlop et al. 2003; Sanchez et al. 2004; Jahnke et al. 2004,
and references therein). Furthermore, Labbé et al. (2005) find
that dusty blue galaxies which are still forming stars constitute
a large fraction (~ 70%) of the high-mass red galaxy
population at z > 2-3, while older "dead" red spheroids
constitute a smaller fraction, ~ 30%, with ages implying
formation redshifts z < 5 (accounting for a rapid quenching of
star formation instead of ongoing star formation; see, e.g.,
Förster Schreiber et al. 2004; van Dokkum et al. 2004).

Figure 11 shows the color-magnitude [(U-V) vs. M_V
shown] tracks with redshift, for the population of spheroids of
fixed total stellar mass M_sph = 10^9, 10^10, 10^11, and 10^12 M⊙,
from right to left, respectively (i.e. decreasing magnitude
with increasing stellar mass). In the upper left, we show
(dashed lines) the tracks predicted by our modeling, assuming
pure density evolution for the quasar luminosity function
above z ~ 2, from the bluest colors below the range
plotted at z > 6 to the reddest colors at z = 0. The tracks
show the mean color and magnitude of the population of
objects at the given mass, as observed at a given redshift.
For comparison, we also plot the observed z = 0
[black; (U-V) ≈ 2.1 - 0.08 (M_V + 20)] (Bower et al. 1992;
Schweizer & Seitzer 1992; Terlevich et al. 2001) and z = 1
[blue; same slope but normalization lower by ~ 0.4 mag]
(Bell et al. 2004b; Giallongo et al. 2005) color-magnitude
relations as solid lines.

The agreement with the observed color-magnitude relations 
is good. At high redshift, galaxies of all masses are still 
forming, and so the mean colors are blue, and there is no 
significant slope in the color-magnitude diagram. However, 
the peak of bright quasar activity at z ~ 2-3 corresponds
to the peak in the formation of massive spheroids via gas- 
rich mergers (subsequent dry merging does not affect our 
results). Feedback from black hole growth quenches fur- 
ther star formation following a merger, and the massive rem- 
nants quickly redden. However, the typical spheroids being 
formed shift to lower masses, as quasars evolve to smaller 
characteristic luminosities with decreasing redshift, keeping 
the population blue at lower masses, and yielding the slope 
of the color-magnitude diagram. This illustrates the anti- 
hierarchical growth of both the black hole and spheroid pop- 
ulations, and their self-consistency given our model of quasar 
lifetimes to connect the two populations. 

In the upper right of Figure 11, we show the theoretical
result assuming pure peak luminosity evolution (PPLE) in the
quasar population above z ~ 2, and reproduce the pure density
evolution (PDE) tracks (dashed lines) and points at redshifts
z = 0, 1, 2, 3 (diamonds) for comparison. At low redshifts,
the agreement with observations is similar. While there is a
discrepancy at the lowest masses, M_sph = 10^9 M⊙, this is both
where the observations are uncertain and where our prediction
is sensitive to the form of the faint-end n(L_peak) [n(M_BH)]
distribution adopted, which, within observational uncertainty, can
be slightly adjusted to yield agreement with the z = 0
color-magnitude relation at these low masses. The evolution in
the slope of the color-magnitude relation is stronger in the
PPLE case than the PDE case because, above z ~ 2, the PDE
model predicts a distribution in formation rates that decreases
uniformly with redshift, implying that objects of any given
mass at these redshifts have the same fractional population
from earlier redshifts. However, the PPLE case assumes that
the distribution of formation rates shifts to lower luminosities
above z ~ 2 rather than uniformly decreasing, implying that
before z ~ 2, most of the lowest mass objects were formed
earliest while larger objects only just formed, with this trend
reversing subsequently. Because most spheroid and quasar
production occurs after z ~ 2-3, this is sufficient to reproduce
the observed z = 0 relations, but results in the stronger slope
evolution, even a reversal in sign in the color-magnitude relation
slope at high redshifts. Therefore, our probes of the mean
ages and in particular the age distribution of even low-redshift
low-mass spheroids, as well as the color-magnitude relation
at moderate and large redshifts, can constrain the evolution in
the high-redshift quasar population.

In the lower left of Figure 11, we show the prediction (in
the same manner as the upper right panel, again reproducing
the upper left panel results of our standard modeling for
comparison), assuming a constant quasar lifetime, exponential,
or "on/off" model of the quasar light curve. The exact
value of the quasar lifetime we choose is unimportant, as it sets
only the normalization of the number of spheroids produced,
not their magnitudes or color distribution. It is clear that
such a model does not accurately reproduce the z = 0
color-magnitude relation, even at moderate spheroid masses
M_sph ~ 10^10-10^11 M⊙. This is because such modeling does not
incorporate strong enough "cosmic down-sizing", i.e. a sufficiently
strong age gradient with spheroid mass, even allowing
for a quasar luminosity function with strong "luminosity-dependent
density evolution" such as the Ueda et al. (2003)
luminosity function adopted here.

Fig. 11. — Predicted mean (U-V) color and M_V magnitude (circles) with rms dispersion in color and magnitude (vertical and horizontal error bars, respectively)
as a function of redshift at z = 0 (black), z = 1 (blue), z = 2 (green), and z = 3 (red), for galaxies with total stellar mass M_sph = 10^9, 10^10, 10^11, and 10^12 M⊙,
from right to left respectively. Our standard modeling, assuming pure density evolution (PDE) in the quasar population above z ~ 2, is shown in the upper left,
with dashed lines showing the full color-magnitude tracks from z = 0 to z > 6. The dashed lines and PDE points from the upper left are reproduced in the other
panels (diamonds), which show the mean color and magnitude with redshift assuming pure peak luminosity evolution above z ~ 2 (upper right), adopting a
constant quasar lifetime or exponential quasar light curve (lower left), or ignoring black hole feedback in mergers (lower right). Solid lines show the observed
color-magnitude relations at z = 0 (Bower et al. 1992; Schweizer & Seitzer 1992; Terlevich et al. 2001) and z = 1 (Bell et al. 2004b; Giallongo et al. 2005) in
black and blue, respectively.

The lower right panel shows our predicted color-magnitude
diagram neglecting black hole feedback in galaxy mergers.
As demonstrated by Springel et al. (2005a), mergers without
black hole feedback result in much weaker heating of the gas
in the galaxy, so that star formation continues, declining in
a roughly exponential manner over a Hubble time, as found
in simulations without black holes by, e.g., Mihos & Hernquist
(1994, 1996). Therefore, we can approximate the prediction
in a model neglecting black hole feedback by allowing for an
exponentially declining star formation rate after a peak
corresponding to the phase of quasar activity. We assume the
timescale for exponential decay is ~ 1 Gyr, similar to that
estimated in simulations neglecting black hole feedback, and
demand that the stellar mass after multiple e-foldings is that
given by, e.g., our M_BH-M_sph relation (although this choice
only weakly affects our results, so long as the M_BH-M_sph
relation holds at least approximately after 1 or more e-foldings
in the star formation rate). The primary result of this is
indicated in the lower right panel of the figure, namely that the
galaxies are much too blue (by ~ 1 magnitude), and do not
develop the characteristic slope of the color-magnitude relation.
This demonstrates the dramatic importance of black hole
feedback, as the rapid quenching of star formation both allows
remnants to redden sufficiently and enables the gradient
in formation age with mass to produce a slope in the
color-magnitude relation, as opposed to its being "washed-out" by
continued star formation in hosts of all masses, regardless of
the peak in their star formation histories.
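
The no-feedback approximation described above can be sketched numerically.
The snippet below is only an illustration: it assumes that all of the
final stellar mass forms in the exponentially declining tail (stars formed
before the peak are ignored), uses the ~ 1 Gyr decay time quoted in the
text as a default, and introduces hypothetical function names. It is not
the calculation used to produce the figure.

```python
import math


def sfr(t_gyr, m_final, tau_gyr=1.0):
    """Star formation rate (mass per Gyr) a time t after the peak."""
    return (m_final / tau_gyr) * math.exp(-t_gyr / tau_gyr)


def mass_fraction_formed(t_gyr, tau_gyr=1.0):
    """Fraction of the asymptotic stellar mass in place by time t."""
    return 1.0 - math.exp(-t_gyr / tau_gyr)


def young_fraction(t_gyr, window_gyr, tau_gyr=1.0):
    """Fraction of the stars present at time t formed in the last window_gyr."""
    t_start = max(t_gyr - window_gyr, 0.0)
    formed_recently = math.exp(-t_start / tau_gyr) - math.exp(-t_gyr / tau_gyr)
    return formed_recently / mass_fraction_formed(t_gyr, tau_gyr)


if __name__ == "__main__":
    for t in (1.0, 3.0, 5.0):
        print(t, mass_fraction_formed(t), young_fraction(t, window_gyr=1.0))
```

In this toy picture a non-negligible share of the stars is younger than
1 Gyr for several e-foldings after the peak, whereas an abruptly quenched
remnant has none; this is the qualitative reason the no-feedback colors
stay blue.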

Figure 12 shows the predicted colors of remnant spheroids
as a function of spheroid stellar velocity dispersion and redshift
(assuming pure density evolution above z ~ 2). We consider
the SDSS colors (g-r) (upper left) and (r-i) (upper
right) and the standard (U-B) (lower left) and (R-K) (lower
right) colors. For the (g-r) and (r-i) colors, we compare
to the color-σ relations observed by Bernardi et al. (2003d,
2005) (filled circles) at z = 0 (black) and z = 0.2 (purple). Both
the z = 0 mean colors and their evolution at low redshift are
reproduced by our modeling, but this is not trivial even given the
M_BH-σ relation and fundamental plane, as for example the
scatter in color is not equivalent as a function of luminosity or
velocity dispersion. The dependence on velocity dispersion is
also reasonably well described, with our prediction within 1σ
of the observations over the range of velocity dispersion for
Fig. 12. — Predicted mean color (diamonds and lines) as a function of
spheroid stellar velocity dispersion, in the manner of Figure 9. Again, our
predictions are shown for z = 0 (black), z = 0.2 (purple), z = 0.5 (blue), z = 1
(cyan), z = 2 (green), z = 3 (orange), z = 5 (red), with bluer colors at higher
redshift. At z = 0 and z = 0.2, our predicted (g-r) and (r-i) colors are
compared to those observed as a function of velocity dispersion in Bernardi et al.
(2003d, 2005).



which they exist. 
Fig. 13. — Same as Figure 12, but adopting the maximal dependence of
total metallicity on age and velocity dispersion from Jørgensen et al. (1999).



The weak variation in these colors with velocity dispersion,
however, means that the small effects of a systematic dependence
of total metallicity on velocity dispersion or age may be
important. We show the consequences of such a dependence
in Figure 13, where we repeat the modeling of Figure 12 but
adopt a scaling of metallicity with age (here we mean z = 0
age, i.e. formation redshift) and velocity dispersion. To estimate
the maximum effect, we consider a metallicity dependence
following the strongest scaling of [Fe/H] with age and
velocity dispersion found by Jørgensen et al. (1999), namely
[Fe/H] = -0.46 log(age/Gyr) + 0.33 log(σ/km s^-1) - 0.30. We
choose this scaling as opposed to others (e.g., Kuntschner
2000) because it includes both the variation with age and velocity
dispersion, but we find similar results neglecting the
dependence on age. The resulting color-magnitude relations
are steepened, and their slopes agree well with the observations.
The colors change by a negligible amount at the approximate
zero-point of the observations at σ ~ 200 km s^-1,
because here the offset of the color-magnitude relation is
determined by the ages of the spheroid populations alone, and
agrees well as in Figure 12. Also, although the agreement in
slope appears improved, we note that the effect is still small,
generally < 0.05 mag in a given color even at the extreme values
of σ ≈ 30, 1000 km s^-1 plotted (except for the high-σ end
of the (R-K) colors, which are discussed above in greater
detail).
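
For reference, the quoted scaling can be evaluated directly. The short
sketch below simply codes the relation as written above; the function name
and the sample ages and velocity dispersions are illustrative assumptions,
and converting the resulting [Fe/H] into a color shift would require a
stellar population model that is not attempted here.

```python
import math


def fe_h_max_scaling(age_gyr, sigma_kms):
    """[Fe/H] from the quoted scaling with age (Gyr) and sigma (km/s)."""
    return (-0.46 * math.log10(age_gyr)
            + 0.33 * math.log10(sigma_kms)
            - 0.30)


if __name__ == "__main__":
    # Hypothetical sample values spanning the plotted range of sigma.
    for sigma in (30.0, 200.0, 1000.0):
        print(sigma, fe_h_max_scaling(10.0, sigma))
```

For a 10 Gyr old population at σ = 200 km s^-1 the relation returns a
value very close to zero, in line with the statement that the colors are
essentially unchanged near the zero-point of the observations.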

This is an approximate upper limit; for example, the other
determinations within Jørgensen et al. (1999) yield smaller
logarithmic slopes of metallicity with σ, e.g. ~ 0.07 as opposed
to the 0.33 shown. That this is still a small effect, and
further that it serves to bring our predictions into better agreement
with observations, suggests that we are safe in neglecting
it in other predictions. However, with improved observations
of the color-σ variation, the distinctions between the predictions
in, e.g., Figure 12 and Figure 13 could be significant
enough to constrain the strength of the metallicity evolution
allowed or required.

We find that the scatter in colors at a given σ is typically
smaller than that at a given magnitude. In § below, we 
demonstrate that this is a consequence of the fact that velocity 
dispersion is directly related to the black hole masses forming 
over cosmic time, whereas the z = 0 magnitude mixes systems
of different masses and ages (and thus different colors) at the 




Fig. 14. — Predicted (U-V) vs. M_B color-magnitude relation at redshifts
z = 0, 0.5, 1, and 2, as labeled. In each panel, 1000 galaxies are generated
according to the predicted joint color-magnitude distributions at the given
redshift. Black points show galaxies older than 0.5 Gyr, red points younger.
In the upper left, the solid line shows the best-fit color-magnitude relation to
our predictions, the dashed line the best fit to the observed galaxies from Bell et al.
(2004b) and Giallongo et al. (2005). In the upper right and lower left, the solid
line shows the observed color-magnitude relation of Giallongo et al. (2005), the
dashed line the observed relation of Bell et al. (2004b).



same observed luminosity. Observationally, Bernardi et al.
(2003d, 2005) also find that these correlations have small scatter,
similar to our predictions, and argue that they are tighter
and may represent a more fundamental correlation than, e.g.,
the color-magnitude relations. We also note that the qualitative
behavior of colors as a function of velocity dispersion and
redshift is similar for each of the colors considered, although
different colors are rescaled about different values, and the
evolution in the slope of the color-σ relation is much weaker
than that of the color-magnitude relation. These properties
make the color-velocity dispersion relation a valuable probe,
not just as a check on the color-magnitude relation but potentially
as a measurement independent of some systematics (for
example, the common observational assumption of constant
slope with redshift in this case appears quite reliable).

Finally, we use our modeling to generate an observed
color-magnitude relation in Figure 14. At each redshift considered,
we calculate the joint probability distribution in both color and
magnitude based on our predicted history of spheroid formation
prior to that redshift (i.e. the color distribution at a given
magnitude in Figure 9, and the distribution in magnitudes from
our predicted luminosity functions in, e.g., Figure 3), and
generate 1000 points (mock galaxies) according to that probability
distribution. These are not full simulated galaxies, but random
points drawn from our calculated joint PDF in color and
magnitude at each redshift. At z = 0, we directly fit the generated
points to a color-magnitude relation, and show the result,
(U-V) = 1.9 - 0.04 (M_B + 20), as a solid black line. Our result is
similar to the observed relation, (U-V) = 2.1 - 0.08 (M_B + 20)
from Bell et al. (2004b) and Giallongo et al. (2005), as is the
absolute distribution in magnitude and color. We show galaxies
older than 0.5 Gyr as black points, and galaxies younger
than this as red points. This demonstrates that very young
galaxies are not a significant contributor to the observed red
galaxy population at low redshift, and thus the fact that they
lie in a bluer, brighter region of color-magnitude space
than the "normal" relaxed elliptical population, as well as
most likely being disturbed systems which would not be
morphologically recognized as ellipticals, is not important in our
calculations at low redshift. The removal of these points
at z = 0 does not change our results significantly, except to
slightly steepen the fitted color-magnitude slope to -0.06, in
better agreement with that observed.
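
The draw-and-fit procedure described above can be mimicked with a toy
example. In the sketch below the joint distribution is a stand-in: a
Gaussian in magnitude (a placeholder, not the predicted luminosity
function) combined with a linear color-magnitude relation and a 0.2 mag
Gaussian scatter in color. The function names, the random seed, and these
distributional choices are all assumptions for illustration; only the
sample size of 1000 and the form of the fitted relation follow the text.

```python
import random


def make_mock_sample(n=1000, slope=-0.08, zero_point=2.1,
                     scatter=0.2, seed=42):
    """Draw (M_B, U-V) pairs from a placeholder joint distribution."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        M_B = rng.gauss(-20.5, 1.0)  # placeholder magnitude distribution
        color = zero_point + slope * (M_B + 20.0) + rng.gauss(0.0, scatter)
        sample.append((M_B, color))
    return sample


def fit_color_magnitude(sample):
    """Least-squares fit of (U-V) = zero_point + slope * (M_B + 20)."""
    xs = [m + 20.0 for m, _ in sample]
    ys = [c for _, c in sample]
    n = len(sample)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    zero_point, slope = fit_color_magnitude(make_mock_sample())
    print(zero_point, slope)
```

With 1000 points and 0.2 mag of scatter, the recovered slope in this toy
setup is uncertain at the level of several thousandths of a magnitude per
magnitude, which gives a rough sense of the sampling noise in such a fit.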

At z = 0.5 and z = 1, the fractional "young" population is
still relatively small, although it does increase, and the observed
color-magnitude relations still agree well with our predicted
distribution of "old" elliptical colors and magnitudes.
We show the observed color-magnitude relations of Bell et al.
(2004b), who assume a constant slope at all redshifts, at
these redshifts as dashed lines, and the observed relation of
Giallongo et al. (2005), who allow the slope to vary, as solid
lines. As shown in Figure 10, we reproduce the observed evolution
in the red/elliptical color-magnitude relations if we restrict
ourselves to the older spheroids which have had sufficient
time after their progenitor gas-rich mergers to relax and
be recognized as ellipticals by either color or morphological
selection criteria. By z = 2 (lower right), however, the fraction
of young objects becomes quite large (~ 0.5), as observed and
discussed further above and in §Q 

6. SPHEROID MASS TO LIGHT RATIOS AND LUMINOSITY-SIZE 
RELATIONS AS A FUNCTION OF MASS AND REDSHIFT 

Figure 15 shows our predicted M/L ratio in the B-band
(M/L_B) as a function of spheroid mass. For each redshift,
we use our modeling of n(M_BH), n(M_sph) from the quasar
luminosity function to determine the distribution of ages for
spheroids of a given mass at that redshift, and from that
determine the distribution of M/L ratios in a given band. The
masses shown are M_vir, the virial mass within the effective
radius (= Sa^Re/G, as defined in § I2.3> . in order to ease com- 
parison with observations (which generally adopt this choice; 
those that do not have been rescaled accordingly). 
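
As a rough illustration of the mass convention just described, the sketch below evaluates a virial mass of the form k σ² R_e / G. The coefficient k = 5, the unit choice for G, and the function name are assumptions made for this example; they are not taken from our modeling code.

```python
# Gravitational constant in kpc (km/s)^2 / M_sun (standard value in these units).
G_KPC_KMS2_PER_MSUN = 4.30091e-6


def virial_mass(sigma_kms, r_e_kpc, k=5.0):
    """Return k * sigma^2 * R_e / G in solar masses.

    k = 5 is an assumed structure coefficient (hypothetical default).
    """
    return k * sigma_kms ** 2 * r_e_kpc / G_KPC_KMS2_PER_MSUN


if __name__ == "__main__":
    # Example: a spheroid with sigma = 200 km/s and R_e = 5 kpc.
    print(f"{virial_mass(200.0, 5.0):.3e} M_sun")
```

Because the mass scales as σ², halving the velocity dispersion at fixed R_e lowers the inferred mass by a factor of four, which is worth keeping in mind when comparing samples with different aperture corrections.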

Our z = 0 prediction is compared to observations
of spheroids in the Coma cluster (at z = 0.023) from
Jørgensen et al. (1995a,b, 1996) (black circles), which are
similar to recent determinations from the SDSS and other
studies (e.g. van der Wel et al. 2005; Cappellari et al. 2005).
The z = 0.3 result is compared to observations of the cluster Cl
1358+62 at z = 0.33 from Kelson et al. (2000) (black squares).
Our z = 1 prediction is compared to several different observations,
including those from 0.6 < z < 1.15 in the Chandra
Deep Field-South sample of van der Wel et al. (2005) (cyan
stars), the z = 1.237 cluster RDCS 1252.9-2927 sample of
Holden et al. (2005) (purple squares), the z = 1.27 cluster
RDCS J0848+4453 galaxies from van Dokkum & Stanford
(2003) (red x's), the z = 0.83 cluster MS 1054-03 sample of
Wuyts et al. (2004) (blue triangles), and the 0.88 < z < 1.3
K20 sample of di Serego Alighieri et al. (2005) (green circles).
In each panel at z > 0, we show the z = 0 mean M/L_B
prediction for comparison (dotted lines). We also show our
predicted mass to I-band light ratios M/L_I as a function of
mass in Figure 16 in the same manner as Figure 15, demonstrating
the relative importance of different age distributions
in different observed wavebands.

Our modeling reproduces the typical M/L_B ratios and their
dependence on mass, and the scatter about the mean M/L_B,
which increases significantly with increasing redshift and decreasing
mass. Although for clarity we have not shown other
redshifts, we have compared e.g. the z = 0.58 MS 2053-04
sample of Wuyts et al. (2004) to our predictions and find similar
agreement. Our modeling further predicts the observed
differential evolution in M/L_B, where the mass to light ratio
declines more rapidly with redshift above z = 0 in smaller-mass
systems, implying that these formed more recently
(see, e.g. Treu et al. 2001; van Dokkum et al. 2001; Treu et al.
2002; van Dokkum & Stanford 2003; Gebhardt et al. 2003;
Rusin et al. 2003; van de Ven et al. 2003; Wuyts et al. 2004;
Treu et al. 2005; Holden et al. 2005; van der Wel et al. 2005;
di Serego Alighieri et al. 2005). At z > 2, our model agrees
well with the observations, for example the mass-to-light ratio
as a function of mass in the K-band of distant red galaxies
found by Labbé et al. (2005), which may even observe the
flattening in the M/L relation we predict for z > 2-3, although
it is difficult to determine this given luminosity limits
at these high redshifts. These observations suggest that
many of the most massive galaxies are forming at this redshift,
with ~ 70% of the population being blue, dusty galaxies
still forming stars at a high rate (Labbé et al. 2005), as we expect
(discussed in more detail below), and a fraction of
the most massive galaxies formed as early as z ~ 5, although
this age is lower than estimated in e.g. Labbé et al. (2005)
if we account for the rapid quenching of star formation seen
in our simulations in modeling the stellar populations (e.g.
Förster Schreiber et al. 2004; van Dokkum et al. 2004).

Our modeling suggests that the M/L_B relation should
steepen below M ~ a few × 10^10 M_⊙, where at low redshift,
samples are severely limited by luminosity/magnitude limits,
making the differential evolution slightly less dramatic. However,
we caution against interpreting this curvature too strictly,
as it depends on both the functional form and quantitative dependence
of the quasar luminosity function break luminosity
on redshift. In our adopted form for the quasar luminosity
function, the break luminosity evolves exponentially with
lookback time, in which case the degree of curvature is quite
sensitive to the coefficient of this exponential growth, whereas
if e.g. we considered exponential evolution in redshift (instead
of lookback time), we would obtain similar values of M/L_B at small
and large M, but with a less curved power-law interpolation
between them.

To illustrate the impact of selection effects, we plot (dashed
lines) the lower observable mass limit for a limiting luminosity
of 10^10 L_⊙ (left) and 10^11 L_⊙ (right) in each panel. The
scaling we describe in § 2.3 between virial and stellar mass
within the effective radius (or stellar mass and effective radius)
is a non-negligible component of the z = 0 slope of the
M/L ratio: ignoring this scaling does not change our predictions
at the high-mass end, but results in an overprediction of
the M/L ratio at the low-mass end by a factor ~ 2. However,
the redshift evolution is almost entirely a consequence of the
different ages of spheroids of different mass; our predictions
for the differential M/L evolution with redshift are essentially
identical if we neglect the weak evolution in the M_sph-R_e relation
with redshift described in Robertson et al. (2005c, in
preparation).
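
The lower observable mass limit for a given limiting luminosity follows from solving M = (M/L)(M) × L_lim. The sketch below does this for a toy power-law M/L relation; the normalization, pivot mass, and slope are illustrative assumptions and are not fitted values from our predictions.

```python
def mass_to_light(mass, ml_ref=5.0, m_ref=1e11, beta=0.25):
    """Toy power-law M/L in solar units; all parameters are illustrative."""
    return ml_ref * (mass / m_ref) ** beta


def min_observable_mass(l_lim, ml_ref=5.0, m_ref=1e11, beta=0.25):
    """Solve M = (M/L)(M) * L_lim analytically for the power-law toy model.

    Valid for beta != 1. Returns a mass in solar masses for L_lim in L_sun.
    """
    return (ml_ref * l_lim / m_ref ** beta) ** (1.0 / (1.0 - beta))


if __name__ == "__main__":
    for l_lim in (1e10, 1e11):
        print(f"L_lim = {l_lim:.0e} L_sun -> M_min = {min_observable_mass(l_lim):.3e} M_sun")
```

A steeper M/L relation (larger beta) pushes the mass limit lower at faint luminosities, since low-mass systems then carry less mass per unit light.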

Differential evolution in the M/Lg ratio is expected in our 
model because the break in the quasar luminosity function 
shifts to lower luminosities below z ~ 2 - 3, implying that 
spheroids with smaller black hole mass (smaller peak lu- 
minosity) are dominating the distribution of objects being 
formed at these later times. Therefore, at z ~ 1 , the lower mass 
objects have formed more recently. However, above z ~ 2 - 3, 
this differential evolution should either flatten or reverse, if 
a pure density or pure peak luminosity evolution model of 
the quasar luminosity function is an accurate description of 
quasar activity. The results in Figure 15 assume pure density
evolution in the quasar luminosity function above z ~ 2. In
this case, above z ~ 2, the shape of the luminosity function
(and therefore, the distribution of peak luminosities and corresponding
spheroid masses being formed) remains constant,
and the normalization decreases with higher redshift. Thus,
all objects have the same distribution of formation ages above
this redshift (with only second-order effects from the finite
quasar lifetime and merger time, at least until high redshifts
where these times become comparable to the Hubble time).
Therefore, the slope of M/L_B vs. M should become flat (except
for the small effects of the M_sph-R_e relation), as seen in
the figure for z = 3.

Fig. 15. — Predicted mean mass to B-band light ratio M/L_B (solid lines) as a function of spheroid mass M, with 1σ dispersion at each M (yellow shaded
region). Results are shown for z = 0 (upper left), z = 0.3 (upper right), z = 1 (lower left), and z = 3 (lower right), as labeled. Observations at z ≈ 0 (black
circles) are from Jørgensen et al. (1995a,b, 1996), at z ≈ 0.3 (black squares) from Kelson et al. (2000), and at z ≈ 1 from van der Wel et al. (2005) (cyan stars),
Holden et al. (2005) (purple squares), van Dokkum & Stanford (2003) (red x's), Wuyts et al. (2004) (blue triangles), and di Serego Alighieri et al. (2005) (green
circles). Luminosity limits of 10^10 L_⊙ and 10^11 L_⊙ are shown in each panel (dashed lines), as is the z = 0 mean M/L_B (dotted lines).

Fig. 16. — As Figure 15, but our predictions are shown for the
I-band mass to light ratio M/L_I.

In a pure peak luminosity evolution scenario, the shape of
the quasar luminosity function above z ~ 2 again remains
roughly constant, but instead of decreasing in normalization,
the break luminosity shifts to smaller luminosities at
higher redshifts, with constant normalization. This implies
that, above z ~ 2-3, the more massive objects have actually
formed more recently, and so the slope of the M/L_B vs. M relation
should be inverted, i.e. M/L_B should decrease with
mass. However, if metallicity evolves with either mass or redshift,
this will affect the mean mass to light ratio and slope
as well, although we discuss this effect above and show in 
Figure^lthat it is small. 
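
The contrast between the two high-redshift evolution scenarios can be illustrated with a toy birthrate. In the sketch below, the log-normal shape, its width, the exponential redshift dependence, and the weighting per unit redshift (rather than per unit time) are all simplifying assumptions chosen for illustration; none of them is the fitted quasar luminosity function.

```python
import math


def formation_rate(ln_m, z, mode):
    """Toy birthrate of spheroids with log-mass ln_m at redshift z >= 2.

    The rate is a log-normal in mass around a break mass, times a
    normalization. Every number here is illustrative.
    """
    width = 1.0          # width of the peak in ln(mass)
    ln_m_break0 = 0.0    # break mass at z = 2, arbitrary units
    if mode == "density":        # fixed shape, falling normalization
        ln_m_break, norm = ln_m_break0, math.exp(-(z - 2.0))
    elif mode == "luminosity":   # fixed normalization, break shifts down
        ln_m_break, norm = ln_m_break0 - (z - 2.0), 1.0
    else:
        raise ValueError(f"unknown mode: {mode}")
    return norm * math.exp(-0.5 * ((ln_m - ln_m_break) / width) ** 2)


def mean_formation_redshift(ln_m, mode, z_min=2.0, z_max=6.0, n=4000):
    """Rate-weighted mean formation redshift on a midpoint grid."""
    dz = (z_max - z_min) / n
    zs = [z_min + (i + 0.5) * dz for i in range(n)]
    weights = [formation_rate(ln_m, z, mode) for z in zs]
    return sum(z * w for z, w in zip(zs, weights)) / sum(weights)


if __name__ == "__main__":
    for mode in ("density", "luminosity"):
        high = mean_formation_redshift(0.0, mode)
        low = mean_formation_redshift(-2.0, mode)
        print(f"{mode}: high-mass <z> = {high:.2f}, low-mass <z> = {low:.2f}")
```

In the density case the mass dependence factors out of the rate, so every mass has the same formation-redshift distribution and the M/L slope flattens. In the luminosity case the high-mass systems form at lower mean redshift within this window, which corresponds to the inverted slope described above.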

We also test whether the distributions of spheroid mass to
light ratios inferred from idealized models of the quasar lifetime
are consistent with observations. We consider a case in
which quasars have a fixed, constant lifetime and radiate at
a fixed luminosity L = L_peak. Here, the value of the quasar
lifetime is unimportant, as it controls only the normalization
of the resulting rates of spheroid formation. We adopt the
luminosity function of Ueda et al. (2003), from the hard X-ray,
modified for pure density evolution above z = 2 following
Fan et al. (2001), although our results are qualitatively insensitive
to these specific choices (Hopkins et al. 2005d). We
have already demonstrated in Figure 4 and Figure 11 that such
modeling predicts a spheroid luminosity function and color-magnitude
relation in stark disagreement with observations.

Fig. 17. — As Figure 15, but assuming a "light-bulb" ("on/off") or pure
exponential model of the quasar light curve and lifetime.

Figure 17 shows the predicted B-band mass to light ratios
M/L_B as a function of mass at redshifts z = 0, 0.3, 1, 3, in
the same manner as Figure 15 and with the same observations
shown, but adopting this idealized model for the quasar
light curve. The predicted mass-to-light ratio is too high by
a factor ~ 2-5 at all but the largest masses, and shows almost
no dependence on mass at any redshift, and no differential
evolution from z = 0 to z = 1. Although both the color-magnitude
relation and mass to light ratios derive from the
same underlying age distribution, the distinction between the
predictions of our full model of quasar activity and idealized
models is significantly stronger in the predicted mass to light
ratios than color-magnitude relations (Figure 11). We note
that the Ueda et al. (2003) luminosity function does include
"luminosity-dependent density evolution," in which the slope
of the faint-end quasar luminosity function evolves with redshift,
implying that the density of lower-luminosity quasars
peaks at lower redshift. This is the only reason, in fact,
that there is any dependence of M/L_B on M at all in Figure 17.
Although this is qualitatively consistent with the anti-hierarchical,
downsizing picture implied by the observations
described above, the figure demonstrates that it is quantitatively
insufficient to account for the downsizing observed in
the spheroid population.

At high redshifts z ~ 2, traditional models of the quasar
luminosity function associate an observed luminosity with a
quasar's peak luminosity, implying that many low-peak luminosity
(i.e. low final black hole mass and, correspondingly,
small spheroid mass) systems are forming at these redshifts.
Even if the inferred formation of these objects reaches a maximum
at somewhat lower redshift, they are still formed over
a wide range of redshifts with a large number of the smallest-mass
systems formed at z ~ 1-3. However, in our model
these observed faint-end objects are really brighter peak luminosity
sources, in a dimmer stage of their evolution; the distribution
of peak luminosities being formed at a given redshift
is actually peaked, at a luminosity corresponding to the break
in the observed luminosity function. Thus, low peak luminosity
systems (small spheroid masses) are not formed until
much later times, when the break luminosity has evolved
to small luminosities. In fact, in our modeling, the observed
change in quasar luminosity function slope is actually a consequence
of the quasar lifetime as a function of luminosity,
while the break luminosity evolution reflects "cosmic downsizing"
(Hopkins et al. 2005f).

Fig. 18. — Predicted luminosity-size relation (in V-band) at several redshifts,
as labeled. The mean luminosity-size relation (black lines) and 1σ
range (yellow shaded area) are shown. Dotted lines in each panel show the
mean z = 0 relation. Observations at z = 0 (squares) are from Shen et al.
(2003), with horizontal error bars showing the dispersion in R_e at each constant
M_V. Observations at z > 0 are from Trujillo et al. (circles) and
McIntosh et al. (2005b, x's).
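
The difference between a "light-bulb" interpretation and a luminosity-dependent lifetime can be sketched numerically. The example below convolves a peaked toy birthrate with an assumed time per logarithmic luminosity interval of the form exp(-L / L_peak). Both the log-normal birthrate and this exponential form are assumptions for illustration only; the lifetimes used in our modeling come from the merger simulations.

```python
import math


def birthrate(log_l_peak, center=12.0, width=0.3):
    """Toy log-normal birthrate of sources per unit log10(L_peak)."""
    return math.exp(-0.5 * ((log_l_peak - center) / width) ** 2)


def phi_light_bulb(log_l):
    """Light-bulb model: every source is seen only at L = L_peak."""
    return birthrate(log_l)


def phi_with_lifetime(log_l, n=2000, lo=9.0, hi=15.0):
    """Convolve the birthrate with an assumed dt/dlogL ~ exp(-L / L_peak)."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        log_lp = lo + (i + 0.5) * step
        # Relative time a source with this peak spends near luminosity L.
        time_per_dex = math.exp(-(10.0 ** (log_l - log_lp)))
        total += time_per_dex * birthrate(log_lp) * step
    return total


if __name__ == "__main__":
    faint, brk = 10.0, 12.0
    print("light bulb faint/break:", phi_light_bulb(faint) / phi_light_bulb(brk))
    print("lifetime   faint/break:", phi_with_lifetime(faint) / phi_with_lifetime(brk))
```

With the light-bulb assumption, the faint end of the observed function can only be populated by faint-peak sources, so a peaked birthrate leaves it nearly empty. With a luminosity-dependent lifetime, the same peaked birthrate populates the faint end with bright-peak sources seen in dimmer phases.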

As discussed in § 2.3 above, Robertson et al. (2005c, in
preparation) analyze scaling relations for merger remnants
and their implications for the fundamental plane. However,
that work considers only the structural properties of individual
objects and does not predict the age distribution of any
population. Here, we determine the distribution of spheroid
ages as a function of e.g. stellar mass, and combine this with
knowledge of the detailed structure of the remnants to predict
the observed luminosity-size relations as a function of
redshift in bands where mass to light evolution is important.
For present purposes, we emphasize that our simulations reproduce
well the observed z = 0 effective radius-stellar mass
relation of remnant red/elliptical galaxies (e.g. Bernardi et al.
2003a; Shen et al. 2003; Padmanabhan et al. 2004; Cappellari
et al. 2005), as well as predicting that this relation should
evolve at most weakly with redshift, in agreement with observations
(Trujillo & Aguerri 2004; Trujillo et al. 2004, 2004;
McIntosh et al. 2005b) (see also e.g. Ferguson et al. 2004;
Bouwens et al. 2004; Papovich et al. 2005, although these authors
do not separate the relation by morphological type).
Given a nearly redshift-independent R_e-M_sph relation, it is
then straightforward to convert our predicted mass-to-light ratios
as a function of mass to a luminosity-size relation (luminosity
as a function of effective radius). This then enables a
secondary means of measuring the relative ages and differential
evolution of the remnant spheroid population, which in
many cases probes different regimes in size and redshift.
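
The conversion from a mass-to-light ratio at given mass to a point on the luminosity-size relation can be sketched as follows. The power-law size-mass relation and its parameters are illustrative assumptions (the slope is only loosely motivated by observed early-type scalings), and the distinction between virial and stellar mass is ignored here for simplicity.

```python
import math

M_V_SUN = 4.83  # absolute V-band magnitude of the Sun (standard value)


def effective_radius_kpc(mass, r_ref=4.0, m_ref=1e11, alpha=0.56):
    """Assumed power-law size-mass relation; parameters are illustrative."""
    return r_ref * (mass / m_ref) ** alpha


def absolute_v_magnitude(mass, mass_to_light_v):
    """Convert a mass and a V-band M/L (both in solar units) to M_V."""
    luminosity = mass / mass_to_light_v  # in L_sun
    return M_V_SUN - 2.5 * math.log10(luminosity)


def luminosity_size_point(mass, mass_to_light_v):
    """Return (R_e in kpc, M_V) for one spheroid."""
    return effective_radius_kpc(mass), absolute_v_magnitude(mass, mass_to_light_v)


if __name__ == "__main__":
    # Same mass, two ages: a younger population has a lower M/L and is brighter.
    for ml in (5.0, 2.5):
        r_e, m_v = luminosity_size_point(1e11, ml)
        print(f"M/L_V = {ml}: R_e = {r_e:.2f} kpc, M_V = {m_v:.2f}")
```

At fixed size, halving M/L brightens the galaxy by 2.5 log10(2), about 0.75 mag, so evolution in the luminosity-size relation traces the age evolution directly when the size-mass relation is nearly fixed.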

Figure 18 shows the resulting predicted luminosity-size relation
(in V-band) at several redshifts. We compare to observations
at z = 0 (squares) from Shen et al. (2003), with horizontal
error bars showing the dispersion in R_e at each constant
M_V. These observations are converted from the r-band
using our predicted color-magnitude relations (discussed above), which
further implicitly guarantee that we reproduce the observed
luminosity-size relation in all other wavebands. Further observations
at each z > 0 are shown from Trujillo et al.
(circles) and McIntosh et al. (2005b, x's).

Fig. 19. — Predicted luminosity-size relation (in V-band) as a function of redshift. Upper panels show the absolute (left) and relative (normalized to the z = 0
value, right) effective radii (and 1σ range of radii, vertical error bars) as a function of redshift, at fixed luminosity (M_V = -18, -20, -22, -24, as labeled).
Lower panels show the absolute (left) and relative (right) V-band magnitude (and 1σ range of magnitudes, vertical error bars) as a function of redshift, at fixed
effective radius (R_e = 1, 2, 5, 10 kpc, as labeled).

Our modeling reproduces both the mean luminosity-size relation
at each redshift, as well as the range of R_e at fixed luminosity
as a function of luminosity (compare the z = 0 dispersions
from Shen et al. (2003) and our modeling). For the observed
redshift ranges, the effect of the change in the R_e-M_sph
relation with redshift in our simulations is small, for example
at fixed M^ph = 10'*' M0 the effective radius decreases by just 
25% from z = 0 to z = 2, and the evolution in the luminosity-
size relation is driven primarily by evolution in mass-to-light 
ratios owing to different spheroid ages as a function of mass 
or size. 

We show the evolution with redshift of both effective radius
at fixed luminosity and luminosity at fixed effective radius
in greater detail in Figure 19. The points at each magnitude
are offset by a negligible amount for clarity. Although
the interpretation is not as straightforward as that of our mass-to-light
ratio predictions, the more rapid and pronounced relative
magnitude evolution of systems with larger effective radii
is a reflection of the same anti-hierarchical growth discussed
above (and below), with larger (higher-mass) systems
forming at higher redshift.

7. GALAXY AGES AS A FUNCTION OF MASS AND LUMINOSITY 

Figure 20 shows the fraction of all z = 0 spheroids of a
given stellar mass formed by a given redshift, as a function
of redshift for spheroid stellar masses M_sph =
10^9, 10^10, 10^11, and 10^12 M_⊙. Given the anti-hierarchical nature
of black hole growth described in Hopkins et al. (2005),
where the highest-mass black holes are formed at high redshifts,
associated with the peak in bright quasar activity, and
lower mass black holes are formed at lower redshift as the
break in the observed quasar luminosity function (corresponding
to the peak in the formation rate of final black hole masses
n(M_BH)) moves to lower luminosities, we expect the trend indicated,
where higher-mass spheroids are formed at higher
redshifts and over a wider range in redshift, as these correspond
to higher-mass black holes.

This evolution in black hole mass explains the observations
of Bernardi et al. (2003c, 2005), who find that color is primarily
correlated with velocity dispersion (see Figure 12), with
the color-magnitude relations discussed above being a consequence
of the fact that magnitudes are also correlated with
velocity dispersion. Based on the quasar luminosity function,
the dispersion in ages for a given σ is small, as black holes of a
given mass form over a well-defined range of redshifts. Since
feedback from black hole growth results in passive evolution
of the remnant after quasar activity, the age (and therefore
reddening) of the remnant is correlated more tightly with the
velocity dispersion (i.e. black hole mass) of the remnant than
its luminosity (magnitude), which mixes galaxies of different
black hole masses and ages.

Figure 20 also shows the fraction of all z = 0 spheroids of a
given B-band magnitude formed by a given redshift. Unlike
the fractional population vs. redshift as a function of mass,
this includes the effects of stellar evolution, effectively mixing
e.g. older, more massive galaxies with younger, less massive
ones that have the same z = 0 B-band luminosity. Despite
this, the trend of higher luminosity objects forming at characteristically
larger redshifts and over a wider range of redshifts
is clear. The flattening of the lowest-luminosity population
growth at z ~ 1 is a consequence of the pure peak luminosity
evolution model for the quasar luminosity function evolution
at z > 2. With pure density evolution above z ~ 2, the lowest
luminosity curve will continue to fall rapidly, without a
significant number of very low peak luminosity (low spheroid
mass) systems forming at high redshift.

In Figure 21, we plot the predicted z = 0 ages of spheroids
as a function of velocity dispersion from our modeling (assuming
pure peak luminosity evolution at z > 2, although this
only becomes important here at very low σ where the range
of ages is relatively large in either case). Observations of the
mean age in bins of log σ are shown from Nelan et al. (2005)
(circles) and Caldwell et al. (2003) (squares), with horizontal
errors showing the range of log σ of each bin and vertical
error bars showing the rms dispersion in ages at the
given velocity dispersion (which can be compared to the yellow
range plotted). The ~ 1σ range of fitted age-σ relations
(i.e. adopting the minimal and maximal fitted age-σ slopes)
from Kuntschner et al. (2001) (dark blue line with triangles),
Trager et al. (2000) (red line with stars), and Jørgensen (1999)
(light blue line with diamonds) are shown as solid lines. The
slopes from the observations of Kuntschner et al. (2001) and
Trager et al. (2000) are determined by fitting in Nelan et al.
(2005).

Fig. 20. — Predicted fraction of z = 0 spheroids with stellar mass M_* formed by a given redshift as a function of redshift, for M_* = 10^9, 10^10, 10^11, and 10^12 M_⊙,
as labeled (left). Right panel shows the same, but for spheroids observed at z = 0 with a given B-band magnitude M_B = -16, -18, -20, -22, -24, as labeled.

Fig. 21. — Predicted z = 0 ages of spheroids as a function of velocity
dispersion. Black solid line shows our predicted median age, yellow range
shows the interquartile range of ages at each σ. Observations of mean age
(and dispersion about the mean, vertical error bars) in bins of log σ are
shown from Nelan et al. (2005) (circles) and Caldwell et al. (2003) (squares),
and the ~ 1σ range of fitted age-σ relations from Kuntschner et al. (2001)
(dark blue line with triangles), Trager et al. (2000) (red line with stars), and
Jørgensen (1999) (light blue line with diamonds) are shown as solid lines.

The agreement at all values of σ is good, again implying
that the downsizing of both galaxy and quasar populations 
is self-consistent when our model of the quasar lifetime is 
adopted, and emphasizing that age evolution as a function of 
velocity dispersion or stellar mass is important along the red 
sequence (i.e. that the red sequence is not merely a metallic- 
ity sequence). There is a slight systematic offset in the mean 
age, with several of the observations estimating ages ~ 1 Gyr 
larger than those we predict, but this is well within the un- 
certainties of both our theoretical modeling and observational 
estimates of absolute ages. 

Our prediction of the age-velocity dispersion relation includes
the observed steepening of the relation at low velocity
dispersions (e.g., Caldwell et al. 2003; Nelan et al. 2005), an
effect not accounted for in fitting a single power law, which is
why the power law fits extrapolated to low σ tend to predict
larger ages than given by either our prediction or the binned
observations. There is also a suggestion that the dispersion
in age becomes larger at low velocity dispersion, an effect
discussed in detail in § 5 and potentially seen in some observations
(e.g. Nelan et al. 2005), but the observations are
still uncertain on this point and, as shown in regard to the
color-magnitude relation, this effect can be quite sensitive to
whether pure peak luminosity or pure density evolution is assumed
for the quasar population at high redshift.

Figure 22 considers the population of very recently formed
spheroids, which will not yet be relaxed or reddened and may
be identified as either peculiar or interacting galaxies. We determine
the fraction of spheroids with ages less than 0.5 Gyr
(upper panels) or 1.0 Gyr (lower panels), as a function of redshift.
In the left and right panels, we show this prediction assuming
either pure density or pure peak luminosity evolution
of the quasar population above z ~ 2, respectively. The predictions
are similar for a limiting age of both 0.5 and 1.0 Gyr;
that our predictions are not strongly sensitive to the spheroid
age in this regime suggests that this can be observationally
measured via relatively simple diagnostics. Clearly, direct
measures of the population of merging and interacting galaxies
probe the fraction of galaxies with very recent formation
times, but by ~ 1 Gyr, many of these objects may be identified
not through more difficult morphological analysis but e.g.
through spectral classification as K+A galaxies.
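
The young fraction described above is the number of spheroids formed within the last 0.5 or 1.0 Gyr divided by the number formed up to the time of observation. A minimal sketch of that bookkeeping follows; the uniform time grid and the constant formation history in the usage example are assumptions for illustration and do not represent our predicted birthrates.

```python
def young_fraction(times_gyr, rates, t_obs, max_age):
    """Fraction of spheroids formed by t_obs that are younger than max_age.

    times_gyr: increasing cosmic times (Gyr) at uniformly spaced bin centers.
    rates: formation rate in each bin (arbitrary units).
    """
    formed = sum(r for t, r in zip(times_gyr, rates) if t <= t_obs)
    young = sum(
        r for t, r in zip(times_gyr, rates) if t_obs - max_age < t <= t_obs
    )
    return young / formed if formed > 0 else 0.0


if __name__ == "__main__":
    dt = 0.01
    times = [(i + 0.5) * dt for i in range(1000)]  # 0 to 10 Gyr
    constant = [1.0] * len(times)                  # hypothetical history
    for max_age in (0.5, 1.0):
        print(max_age, young_fraction(times, constant, t_obs=5.0, max_age=max_age))
```

For a formation rate that declines toward the present, the numerator shrinks while the denominator keeps its accumulated total, so the young fraction falls toward low redshift.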

Moreover, the fraction of young objects at e.g. z = 3 is sen- 
sitive to the strength of the density evolution modeled, which 
allows observations of the distribution of spectral types as a 
function of redshift to not only test our modeling but also to 
constrain the form of high-redshift quasar evolution. While 
at very low redshift the results are similar, the prediction that 
the fraction of young objects should be higher in low-mass 
spheroids reverses rapidly at z ~ 1 in the pure peak luminos- 
ity evolution case. This distinction should allow even rough 
observations of the fraction of K+A vs. A galaxies at z > 1-2 
to break the observational degeneracy between pure density 
and pure peak luminosity evolution.

Bernardi et al. (2003c) find from the color and chemical
evolution of SDSS elliptical galaxies that these galaxies are
passively evolving at redshifts z ≲ 0.5, and that they (on average)
formed ~ 9 Gyr in the past. Bernardi et al. (2003a,b)
determine the same characteristic age independently based
on analyses of the fundamental plane and z = 0 galaxy scaling
relations. This corresponds to a redshift of formation
z ~ 1.5, consistent with our predictions for the formation redshifts
of massive red galaxies. This age also makes it clear
that the peak elliptical galaxy formation occurs contemporaneously
with peak quasar activity at z ~ 2, which is explained
if spheroids and quasars form together.
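
The correspondence between a formation time ~ 9 Gyr in the past and a formation redshift near 1.5 can be checked with the closed-form age of a flat matter plus Λ universe. The cosmological parameters below (H0 = 70 km/s/Mpc, Ω_m = 0.3) are assumptions adopted for this sketch.

```python
import math


def age_at_redshift_gyr(z, h0=70.0, omega_m=0.3):
    """Cosmic age (Gyr) at redshift z in flat LambdaCDM (matter + Lambda)."""
    omega_l = 1.0 - omega_m
    hubble_time = 977.8 / h0  # 1/H0 in Gyr for H0 in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * hubble_time * math.asinh(x)


def lookback_time_gyr(z, **cosmo):
    """Lookback time (Gyr) from z = 0 to redshift z."""
    return age_at_redshift_gyr(0.0, **cosmo) - age_at_redshift_gyr(z, **cosmo)


def redshift_at_lookback(t_gyr, z_hi=20.0, **cosmo):
    """Invert lookback_time_gyr by bisection (lookback time rises with z)."""
    lo, hi = 0.0, z_hi
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if lookback_time_gyr(mid, **cosmo) < t_gyr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"lookback time to z = 1.5: {lookback_time_gyr(1.5):.2f} Gyr")
    print(f"redshift at 9 Gyr lookback: {redshift_at_lookback(9.0):.2f}")
```

Under these assumed parameters a 9 Gyr lookback time falls at a redshift of roughly 1.4, consistent with the approximate formation redshift quoted above.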

This is also consistent with direct observations of the morphologies
of galaxies, which show that by z ~ 0.7 red galaxies
are almost all relaxing ellipticals, with little contribution
to observed luminosity from e.g. dusty spirals (Bell et al.
2004a). Fontana et al. (2004) also find similar results from
studying ellipticals in the K20 survey; namely, that massive
ellipticals evolve passively for z < 0.7, with little growth in
the total mass density in spheroids. However, at z > 1, the
mass growth in ellipticals rises steeply, with most mass assembly
at z ~ 1-2. Specifically, they estimate ~ 1/3 of
the present mass of massive ellipticals has been assembled
recently by z ~ 2, in agreement with our predictions for the
evolution of the stellar mass function and ages (Figures 3, 20,
and 22). They further find that for z > 1, the z = 0 population
of massive ellipticals becomes increasingly dominated by
star-forming galaxies, as expected in a merger-driven scenario
for contemporaneous spheroid and quasar formation. Likewise,
Somerville et al. (2004) and Daddi et al. (2004) observe
that at z ~ 1.5-2, the massive elliptical population includes
large numbers of highly disturbed morphologies indicative of
merger-induced starbursts. Cross et al. (2004) find from fundamental
plane analyses that the production of massive red
ellipticals should increase with cosmic time to a peak at z ~ 2
and then fall, suggesting that this is the epoch of peak massive
spheroid formation. This is also supported by direct observa- 
tions of quasar host galaxies, which find strong evidence for 
simultaneous and strongly associated black hole growth and 
star formation at redshifts corresponding to peak quasar ac- 
tivity (z > 1) (e.g., Alexander etal, 2 

Many observations indicate that galaxy age increases with
velocity dispersion or spheroid mass (e.g., Jørgensen 1999;
Trager et al. 2000; Kuntschner et al. 2001; Caldwell et al.
2003; Fontana et al. 2004; Bernardi et al. 2005; Faber et al.
2005; Howell 2005; Tanaka et al. 2005; Gallazzi et al. 2005;
Nelan et al. 2005), as we have considered in Figure 21.
iGallazzi et al.. (i2005.) also quantify this trend in terms of stel- 
lar mass, finding that galaxies with mass ~ 10^-10'^Mq 
form at redshifts z ~ 1.5-2, with median age increasing sys- 
tematically with mass; they estimate e.g. ^ 16% of lO'^M© 
galaxies (at which point their sample is spheroid-dominated) 
are in place by z 2, rising to ^ 50% at z = 1.8, similar to 
our predictions in Figure |^ This is a consequence of the 
strong anti-hierarchical black hole growth implied by our interpretation
of the quasar luminosity function, where higher-mass
black holes (thus higher-σ spheroids) form at higher redshift
z ~ 2, and thus we reproduce both the mean age of z = 0
spheroids and its evolution with velocity dispersion and mass.
These authors also find that higher velocity dispersion does
not imply strongly decreasing metallicity, which is consistent
with our picture of rapid metal enrichment (even at high redshift)
in the starburst phase of the merger.

Our results are consistent with ages inferred from
fundamental plane analyses (e.g., van Dokkum & Franx
1996; Jørgensen et al. 1996, 1999; van Dokkum et al.
1998, 1999, 2000, 2001; Treu et al. 2001, 2002;
Gebhardt et al. 2003; Cross et al. 2004; Wuyts et al. 2004;
van de Ven et al. 2003), color and spectral analyses (e.g.,
Bower et al. 1992; Ellis et al. 1997; Bernardi et al. 1998;
Stanford et al. 1998; Ferreras et al. 1999; Schade et al. 1999;
Menanteau et al. 2001; Kuntschner et al. 2002; Treu et al.
2002; Pozzetti et al. 2003; van de Ven et al. 2003; Bell et al.
2004b; Förster Schreiber et al. 2004; Labbé et al. 2005),
and gravitationally lensed objects (e.g., Rusin et al. 2003;
Rusin & Kochanek 2005). These all indicate typical
formation redshifts z ~ 1.5-2.5, with a large range of
formation redshifts Δz ~ 1.5-2.0 (Treu et al. 2001, 2002;
van de Ven et al. 2003; Cross et al. 2004; Rusin & Kochanek
2005) and subsequent passive evolution of reddening
remnant ellipticals. Although semi-analytical models of hierarchical
galaxy formation reproduce this as a general trend in
the star formation history of the Universe, recent results by
Menci et al. (2005), which attempt to reproduce the observed
bimodal color distribution of galaxies, predict that red
galaxies formed only in dense environments, underpredicting
the relative red field galaxy population and the number of
faint red galaxies. Furthermore, this semi-analytical modeling
predicts that red galaxies form at much too high a redshift,
z ~ 4-5. Explicitly, Daddi et al. (2004) find that the number
density of massive spheroids which are forming and should
appear as highly disturbed starbursting galaxies at z ~ 2
is underpredicted by a factor of at least ~ 30 by current
semi-analytical models.

Fig. 22. — Predicted fraction of spheroids of a given total stellar mass (as labeled) with ages less than either 0.5 Gyr (upper panels) or 1.0 Gyr (lower panels),
as a function of redshift. Left panels assume pure density evolution in the quasar population above z ~ 2, right panels assume pure peak luminosity evolution.



A key ingredient in resolving this discrepancy is clear from
the results of Springel et al. (2005a), who show that feedback
from black hole growth and quasar activity is critical
in rapidly terminating star formation, allowing the production
of quiescent red ellipticals even from mergers of relatively
low-mass (faint) objects at much lower redshifts and
explaining the observations of more recent formation redshifts
z ~ 2. Furthermore, the presence of a massive black hole is
also important in maintaining continued reddening of the elliptical,
as feedback from residual accretion can re-heat the
gas, suppressing further star formation after the merger. This
is also suggested directly by the comparison between luminosity
functions and the modeling of Nagamine et al. (2001),
Menci et al. (2003), and Granato et al. (2004) in Fontana et al.
(2004), who show that these models under- or over-predict the
bright luminosity function at high redshift, but that AGN feed- 
back can regulate the slope of the galaxy stellar mass function 
at low masses. It is also important to note that even those models
which incorporate black hole growth and feedback (e.g.,
Granato et al. 2004) must properly model the quasar lifetime
and its dependence on luminosity (Hopkins et al. 2005c,e) in
order to simultaneously reproduce the quasar and red galaxy
luminosity functions and other properties in any picture of
merger-driven AGN activity, as we demonstrate in Figures 4,
11, and 17.

8. CONCLUSIONS 

Here, we have considered the consequences of a merger- 
driven scenario for the joint formation of spheroids, quasars, 
and relic supermassive black holes for the population of 
red galaxies. As we demonstrate elsewhere, the remnant
spheroid hot X-ray emitting gas properties (Cox et al. 2005a),
morphologies (Cox et al. 2005b, in preparation), metallicities
(Cox et al. 2005c, in preparation), M_BH-σ relation
(Di Matteo et al. 2005), fine structure (e.g., Hernquist &
Spergel 1992), and fundamental plane relations (Robertson
et al. 2005c, in preparation) agree with observations. The
expulsion of gas in these final stages of black hole growth 
is violent, and leaves a gas-poor remnant, with most of the 
remaining gas heated to virial X-ray emitting temperatures 
and effectively terminating star formation. This produces 
the observed red, elliptical galaxy population in the bimodal 
color/morphology distribution of galaxies, explaining the bimodality
seen at low and moderate redshifts, with quasar feedback
providing the necessary means of quickly moving galaxies
from the "blue" evolutionary sequence (with continual star
formation) to the "red" sequence (with negligible ongoing star 
formation) (Springel et al. 2005a). 

We use our model of quasar lifetime and evolution in merg- 
ers derived from simulations to de-convolve the observed 
quasar luminosity function and determine the rate of forma- 
tion of black holes of a given final mass as a function of black 
hole mass and redshift. Identifying quasar activity with the 
formation of spheroids in the framework of the merger hy- 
pothesis of hierarchical theories of galaxy formation, we then 
determine the corresponding rate of formation of spheroids 
with given properties as a function of redshift. 
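
As a minimal illustration of this de-convolution (a sketch under stated assumptions, not the fitting procedure used in this work), the observed luminosity function can be written as a discrete sum over peak luminosities, phi(L) = sum over L_peak of dt/dlogL(L | L_peak) * ndot(L_peak) * dlogL_peak, and inverted for the birthrate ndot. The power-law lifetime, its normalization `t_star` and slope `alpha`, the luminosity grid, and the lognormal birthrate below are all hypothetical choices made only so that the example runs.

```python
import numpy as np

# Hypothetical grid in log10(L / L_sun); shared by L and L_peak for simplicity.
LOG_L = np.arange(9.0, 14.0, 0.25)
DLOG = 0.25

def lifetime_kernel(log_l, log_l_peak, t_star=1.0e8, alpha=-0.5):
    """Toy dt/dlogL in yr: power law in L/L_peak up to the peak, zero above it."""
    ratio = 10.0 ** (log_l[:, None] - log_l_peak[None, :])
    below_peak = log_l[:, None] <= log_l_peak[None, :]
    return np.where(below_peak, t_star * ratio ** alpha, 0.0)

def predict_qlf(ndot, kernel, dlog):
    """Forward model: luminosity function from the birthrate ndot(L_peak)."""
    return (kernel @ ndot) * dlog

def deconvolve_qlf(phi, kernel, dlog):
    """Invert the upper-triangular linear system for ndot(L_peak)."""
    return np.linalg.solve(kernel * dlog, phi)

# Assumed lognormal birthrate in Mpc^-3 yr^-1 per dex of L_peak (illustrative only).
ndot_true = 1.0e-12 * np.exp(-0.5 * ((LOG_L - 12.0) / 0.5) ** 2)

kernel = lifetime_kernel(LOG_L, LOG_L)
phi = predict_qlf(ndot_true, kernel, DLOG)
ndot_recovered = deconvolve_qlf(phi, kernel, DLOG)

# Idealized "on/off" quasars, visible only at L_peak for a fixed time t_star.
phi_light_bulb = 1.0e8 * ndot_true * DLOG
```

In this toy setup the faint end of `phi` is dominated by objects with much larger peak luminosities seen in dimmer phases, whereas `phi_light_bulb` simply traces the birthrate; the two mappings between luminosity function and birthrate therefore differ most at low luminosity.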

We predict the distributions of galaxy velocity dispersions, 
the galaxy mass function, mass density, and star formation 
rate, the luminosity function in many observed wavebands 
(e.g., NUV, U, B, V, R, r, I, J, H, K), the total number density 
and luminosity density of galaxies, the distribution of colors
as a function of magnitude for several different wavebands, 
the distribution of colors as a function of velocity dispersion, 
the distribution of mass to light ratios as a function of mass, 
the luminosity-size relations, and the typical ages and distri- 
bution of ages (formation redshifts) as a function of mass, ve- 
locity dispersion, and luminosity. For each of these quantities, 
we predict the evolution from redshift z = 0-6, although at
high redshifts z > 2, our modeling suffers from the degener- 
acy between pure peak luminosity evolution and pure density 
evolution in the observed quasar luminosity function. Still, 
our results agree well with observations over a wide range of 
redshifts. 

Many of these predicted quantities, including the col- 
ors, mass-to-light ratios, and luminosity-size relations of 
spheroids, are essentially probes of the distribution of ages as 
a function of spheroid mass. However, this does not mean that 
they are trivially related, as they manifest a different depen- 
dence on subsequent star formation, structural galaxy scalings 
(e.g., M_sph-M_vir or M_sph-R_e relations), and dispersion in age
as a function of different variables (as for example we have 
shown that the dispersion in colors and ages is different as a 
function of luminosity, mass, and velocity dispersion). Fur- 
thermore, if effects such as dry merging or metallicity scaling 
with stellar mass were not, as we have demonstrated, second 
order effects, they would break the implicit self-consistency 
of these quantities. Most important, different samples which 
probe e.g. different mass ranges, environments, sample sizes, 
and redshifts (and, correspondingly, have different systematic 
effects and biases) measure different quantities and constrain 
age distributions by these different methods, and therefore it 
is important to compare to the complete range of such obser- 
vations rather than one particular choice. 

Our results tie together the observed red, elliptical galaxy 
population and the quasar and relic supermassive black hole 
populations. With our modeling of quasar and merger activ- 
ity derived from hydrodynamical simulations, we have shown 
that the diverse set of galaxy observations listed above can be 
predicted directly from the observed quasar luminosity func- 
tion. We have demonstrated that the quasar luminosity func- 
tion implies the properties of the red galaxy population and 
their evolution with redshift, providing compelling evidence 
that spheroid and quasar formation must be driven by the same 
process of galaxy merging. 

Our methodology depends only on the form of the quasar 
lifetime as a function of peak luminosity, and simple scaling 
relations between black hole and galaxy properties such as the 
M_BH-σ relation. Our simulations reproduce these scalings,
independent of a wide range of host galaxy properties includ- 
ing gas fractions, presence or absence of bulges, initial black 
hole masses, ISM gas equation of state, galaxy orbital param- 
eters, and virial velocities. For example, we have varied the 
mass ratio of the merging galaxies and find that these scalings 
are unchanged between simulations with mass ratios of 1:1,
2:1, 3:1, and 5:1. We demonstrate in Hopkins et al. (2005f)
that the scaling of quasar lifetime with luminosity and peak 
luminosity can be understood as a consequence of black hole 
self-regulation. Thus, as long as black holes still self-regulate 
in a manner which preserves observed relations, we expect 
these scalings to be robust with respect to mass ratios and the 
merger parameters listed above. 
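
To make concrete how such scaling relations carry a final black hole mass over to spheroid properties, the following is a minimal sketch. It assumes a power-law M_BH-σ relation with the normalization and slope reported by Tremaine et al. (2002) and a constant ratio M_BH/M_sph = 0.001; both are illustrative values and not necessarily those adopted in our modeling, and the function names are hypothetical.

```python
import math

# Assumed scaling-relation parameters (illustrative; not taken from this paper):
# log10(M_BH / M_sun) = 8.13 + 4.02 * log10(sigma / 200 km/s), and a constant
# black hole to spheroid stellar mass ratio of 0.001.
MBH_SIGMA_NORM = 8.13
MBH_SIGMA_SLOPE = 4.02
MBH_TO_MSPH = 1.0e-3

def sigma_from_mbh(m_bh):
    """Velocity dispersion in km/s for a final black hole mass in solar masses."""
    return 200.0 * 10.0 ** ((math.log10(m_bh) - MBH_SIGMA_NORM) / MBH_SIGMA_SLOPE)

def msph_from_mbh(m_bh):
    """Spheroid stellar mass in solar masses for a constant M_BH / M_sph ratio."""
    return m_bh / MBH_TO_MSPH

# Map a few final black hole masses to the remnant spheroid properties.
for m_bh in (1.0e7, 1.0e8, 1.0e9):
    print(f"M_BH = {m_bh:.1e}  sigma = {sigma_from_mbh(m_bh):.0f} km/s  "
          f"M_sph = {msph_from_mbh(m_bh):.1e}")
```

Because both mappings are monotonic, a birthrate expressed per logarithmic interval in final black hole mass translates directly into a birthrate per logarithmic interval in spheroid mass or velocity dispersion, up to the constant Jacobian of the power law.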

The independence of these scalings, expressed in this man- 
ner, has the advantage that it allows us to relate and predict 
the properties of the quasar, black hole, and spheroid populations
independent of a complete cosmological framework.
Our approach thus allows us to determine, without introduc- 
ing tunable parameters or additional uncertainty regarding de- 
tailed cosmological distributions, whether the merger hypoth- 
esis and the joint formation of spheroids and supermassive 
black holes in a quasar phase in major mergers are simul- 
taneously consistent with quasar and spheroid observations. 
Furthermore, it allows us to constrain the underlying cosmo- 
logical rate of creation or formation of spheroids and quasars 
in major mergers as a function of e.g. quasar peak luminos- 
ity or spheroid mass. These constraints appear to be consis- 
tent with observational estimates of merging galaxy luminos- 
ity functions (Hopkins et al. 2005g, in preparation), as these 
have a well-defined peak and turnover corresponding to that
predicted in e.g. our ṅ(M_sph) distribution (e.g., Xu et al. 2004;
Wolf et al. 2005).

Our detailed results for individual galaxy mergers and con- 
straints on the formation rates of spheroids can be combined 
with and used to test cosmological models, but we caution 
against too direct a comparison of predicted merger rates 
with the constraints from our modeling, at least presently. 
The mergers which produce quasars and spheroids, and are 
therefore of interest to and constrained by our modeling, 
are mergers not just of halos, but halos that host galaxies, 
and where the galaxies themselves have comparable masses 
and large reservoirs of cold gas, and will themselves merge 
in a Hubble time. There are certainly sufficient halo-halo 
major mergers in the standard CDM cosmology to explain 
the galaxy merger rates we infer; for example, the calculations
of Kauffmann & Haehnelt (2000); Wyithe & Loeb
(2003); Granato et al. (2004) show that there are more than 
enough major mergers at all masses to account for observed 
quasars with a one-to-one correspondence between quasars 
and ongoing halo mergers, even with a short quasar lifetime 
df/dlogL ^ 10^ yr (much shorter than the quasar lifetime we 
calculate for luminosities below the break in the observed lu- 
minosity function). However, cosmological simulations do 
not yet have the resolution to determine the rates and proper- 
ties of such mergers, let alone the gas physics of star forma- 
tion and black hole accretion and feedback. Semi-analytical 
models do not calculate the physics of these processes in a 
self-consistent manner, and must adopt a number of assump- 
tions about merger properties which introduce considerable 
uncertainty (and allow considerable fine-tuning) in the pre- 
dictions of the rates and effects of such mergers. Still, ideally, 
our results can be combined with such approaches in a manner 
which greatly increases their effective dynamic range, even- 
tually enabling an a priori prediction of the relevant merger 
rates and quasar and spheroid properties from a fully theoret- 
ical framework. 

The merger hypothesis presented by Toomre (1977) met 
with a great deal of skepticism, much of which persists nearly 
30 years later. However, many of the objections to Toomre's 
proposal owe to an inappropriate comparison between the 
properties of interacting galaxies seen locally, and those of 
large ellipticals which, in our model, formed when the Uni- 
verse was only a small fraction of its present age. For ex- 
ample, Ostriker (1980) argued that ellipticals could not form 
in the manner suggested by Toomre because ellipticals are 
more concentrated than disks of local spirals. This viewpoint 
can be expressed most neatly in terms of phase space densi- 
ties: ellipticals have higher central phase space densities than 
disks of local spirals and because, according to Liouville's 
Theorem, phase space density is conserved during a collisionless
process, mergers between disks cannot explain the
high phase space density of ellipticals (Carlberg 1986, Gunn 
1987). N-body simulations show that this is indeed the case 
(e.g. Barnes 1988, 1992; Hernquist 1992, 1993a), but this ar- 
gument is flawed when applied to the merger hypothesis in at 
least two ways. First, disks at high redshifts were likely more 
compact than their counterparts in the local Universe. Sec- 
ond, and perhaps more important, disks at z > 1 were almost 
certainly more gas-rich than those of local spirals. As em- 
phasized by e.g. Lake (1989), Liouville's Theorem does not 
apply to mergers involving gas-rich galaxies because gas can 
radiate energy. 

Previous efforts to include gas dissipation in galaxy mergers,
such as those of Hernquist (1989), Barnes & Hernquist
(1991, 1996), and Mihos & Hernquist (1996), were restricted
to cases where the progenitor galaxies were ≲ 10%
gas because the ISM was modeled as a single-phase, isother- 
mal medium. However, based on simulations and simple 
physical arguments, Hernquist et al. (1993) estimated that 
remnants of disk mergers would have a sufficiently high phase 
space density to explain central properties of ellipticals only 
for progenitor gas fractions > 25-30%. More complex, more 
realistic treatments of the ISM as a multiphase medium (e.g. 
Springel & Hernquist 2003a) now make it possible to construct
disks with much larger gas fractions that do not violate
the Toomre (1964) stability criterion (see, e.g., Fig. 6 of
Springel et al. 2005b). 

The simulations used in the present study, which employ 
galaxies with larger gas fractions than in earlier works and 
with galaxy structure reflecting cosmic evolution, show that 
mergers can, in fact, account for observed properties of ellip- 
ticals. Furthermore, by incorporating black hole growth and 
feedback into the simulations, we have demonstrated that the 
various processes attending a gas-rich merger can explain a 
much broader class of phenomena than Toomre's (1977) orig- 
inal hypothesis. Indeed, it is a remarkable fact that the critical 
gas fraction suggested by Hernquist et al. (1993) to overcome 
the phase space density problem is similar to that required for 
mergers to produce AGN with luminosities matching those of 
bright quasars at z ~ 2 as well as reproducing observed kinematic
and structural properties of ellipticals that have been
puzzling up to now. For example, as we show in Cox et al. 
(2005b, in preparation), the observed distribution of projected 
misalignments between spin and minor axes of ellipticals is 
naturally reproduced by our models if the gas fraction is large 
enough, which is not true for mergers between gas-poor spi- 
rals. These gas fractions are appropriate for the redshifts of 
formation we have determined here, with most large ellip- 
ticals building up their mass at moderate to high redshifts 
z ~ 1.5-2.5, and subsequent mergers primarily "dry" or collisionless.
These various lines of evidence all support the picture
that quasars and ellipticals originated through the same
process: mergers between gas-rich galaxies.

Semi-analytical models in which interactions and galaxy
mergers fuel starburst activity (e.g., Cole et al. 2000;
Somerville et al. 2001; Menci et al. 2004) and cosmological
hydrodynamical simulations (e.g., Davé et al. 2002;
Nagamine et al. 2004b, 2005a,b; Night et al. 2005;
Finlator et al. 2005) have improved our understanding of
galaxy formation and evolution, reproducing the properties 
of the cumulative galaxy population and explaining the 
tendency of larger galaxies to be redder and older as a 
natural consequence of hierarchical growth scenarios. Such 
modeling may even be able to account for bimodality in the
z < 1-2 galaxy color distribution (Menci et al. 2005), with
red galaxies formed in dense environments at high redshifts 
z ~ 4-5, with several early merging events and interactions 
ceasing at later redshifts in these environments. However,
as Springel et al. (2005a) and this work make clear, these
models must incorporate feedback from AGN activity (as in,
e.g., Granato et al. 2004) and the corresponding very rapid
expulsion of gas and quenching of star formation in mergers
to explain the formation of red spheroids at much later times
z ~ 1.5-2, as the bulk of observations suggest (see §0), as
well as the significant faint population of such objects and
their field population, as most observations find very little dependence
on environment in the red galaxy population at fixed
mass or luminosity (Blanton et al. 2003; Balogh et al. 2005;
Hogg et al. 2004). Feedback from starburst-driven winds
and AGN may also be critical in suppressing excessive early
formation of low-mass spheroids (e.g., Granato et al. 2004;
Silva et al. 2005), in order to explain the anti-hierarchical
growth of spheroid and black hole mass implied by the 
quasar luminosity function. A proper accounting of the 
luminosity dependence of the quasar lifetime shows that the 
anti-hierarchical "downsizing" seen in both spheroid and 
quasar evolution is completely self-consistent, which is not 
the case if this dependence is ignored. 

Also, unlike these and other previous galaxy evolution 
models, we are able to specifically predict the properties of 
the red/spheroid population, and do so without the addition of 
new tunable parameters. The input physics of our simulations 
and modeling is already strongly constrained by an extensive 
range of observations of quasar properties (Di Matteo et al. 
2005; Robertson et al. 2005b; Hopkins et al. 2005a-e), essen- 
tially fixing our model, at which point the only essential ob- 
servational input is the observed quasar luminosity function. 
Our predictions demonstrate that the observed properties of 
quasars provide powerful constraints on the spheroid popu- 
lation, and likewise that spheroid observations can strongly 
constrain quasar evolution, especially at low luminosity and 
high redshift where direct observations are difficult. We fur- 
ther demonstrate that these predictions are skewed by sev- 
eral orders of magnitude if we adopt idealized models of 
the quasar lifetime in which quasars turn "on"/"off" or fol- 
low exponential light curves, instead of the more complicated 
quasar evolution we have studied in our simulations, demon- 
strating that it is not possible to reconcile the quasar and 
spheroid galaxy luminosity functions or spheroid ages, colors, 
or mass-to-light ratios in models of joint AGN-spheroid for- 
mation without accounting for luminosity-dependent quasar 
lifetimes. As a result, previous attempts to infer the properties
of the spheroid population from the quasar luminosity
function (e.g., Merloni et al. 2004), although providing
strong evidence of general co-evolution, have been forced to
invoke evolution in e.g. the M_BH-M_sph relation to explain integrated
properties of the spheroid population and could not
predict e.g. spheroid luminosity functions, whereas the application
of more realistic quasar lifetimes immediately resolves
these difficulties. Any modeling which attempts to simultaneously
reproduce the properties of quasars and galaxies
(e.g., Kauffmann & Haehnelt 2000; Volonteri et al. 2003;
Wyithe & Loeb 2003; Granato et al. 2004), specifically the
remnant spheroid population, with AGN activity triggered in 
interactions and mergers must account for the effects of feed- 
back and gas physics on quasar evolution, and in particular 
must account for the non-trivial, luminosity-dependent nature 
of the quasar lifetime.

We provide a large number of new predictions of the evolution
of red galaxy properties with redshift, for comparison
with future observations, which can be used to test this model 
and refine our understanding of joint spheroid and AGN for- 
mation. Our modeling also motivates observations of e.g. the 
ages, mass-to-light ratios, and colors of low-mass/luminosity 
galaxies at z = 0-1 to strongly constrain whether pure luminosity
or pure density evolution occurs in the quasar/spheroid
population at high redshift, where direct observations are in- 
accessible. These observations can also constrain the shape 
of the faint-end peak luminosity distribution, i.e. the low 
mass slope of the rate at which quasars of a given final 
black hole mass form (directly related to the remnant spheroid 
properties), where observations of e.g. the quasar luminosity
function provide only very weak constraints (Hopkins et al.
2005i).